The amount of data we generate each day is mind-boggling. According to a World Economic Forum report, we reached 44 zettabytes of data in 2020. By 2025, we will generate about 463 exabytes of data each day. These numbers are bigger than most of us care to think about.

These are only estimates, but they give us a sense of how much digital information we, on this planet, produce.

Your business, however small, generates electronic data. This includes personal data from your customers, engagement data, behavioral data, and even attitudinal data, which measures things like customer satisfaction, purchase criteria, and product desirability. You also have supplier and stakeholder data.

It is no wonder that data has become such a sensitive issue. One of the biggest responsibilities of any business is to safeguard the information it collects. Unfortunately, data has become a commodity and a very valuable one too.

Some of the biggest companies in the world trade on it: Google collects user data for advertising, and Amazon collects customer data to monitor trends, suggest similar products and send out offers to prospective buyers.

Many advertising companies actively buy data for their customers, using it to send ads to what they deem profitable demographics.

All this activity is tame when we consider the darker side of data use. In recent years, hackers have changed the data landscape. They steal massive troves of data from companies and sell it to the highest bidder.

The statistics are enough to give business owners and managers sleepless nights. According to a DataProt report, more than 1.7 billion corporate records were lost to hackers in 2019. A ransomware attack happens every 14 seconds. The same report estimates that the world will spend more than $10 billion annually on data security by 2027.

This may not be good news for businesses, but it is encouraging news for anyone studying business analytics and data science. These courses equip students to deal with big data, helping companies adopt ethical and secure ways to handle all types of data.

Online business analytics programs, such as the one provided by St Bonaventure University, cover subjects like data visualization and analysis, prescriptive and predictive analytics, and business data strategy. Students gain advanced knowledge of data analytics, data protection, and ethics, all skills that are in demand. They also learn how to collect, analyze, and store large amounts of data, and how to use it to solve problems with current technologies. Online courses are a popular choice because they are often shorter and more convenient.

Before enrolling, it is a good idea for students to get an insight into the ethics of data collection, storage, and analysis.

What is data ethics?

As the field is relatively new, the definition of data ethics tends to change every few years.

However, there is a consensus among experts that it can be described as the moral obligation that encompasses gathering, analyzing, and storing data.

It is an area that should be of utmost concern to every business. Getting data ethics wrong can be costly. It isn’t unusual for small businesses to close down because of a data breach. Code Spaces, Nirvanix, and MyBizHomepage are just three examples of small businesses that were doing well until they were crippled by data breaches that rendered them useless.

In today’s world, reputation is everything, and if customers cannot trust a company to secure their data, they stop interacting with it.

Big companies are no exception, despite spending tens of millions of dollars every year on data protection.

GDPR principles, used to define data collection and protection for businesses in Europe, can help us understand the meaning of data ethics. They outline that businesses must act transparently and with consent, and they must use the data they collect for the stated purpose.

While most American businesses aren’t bound by GDPR rules, we have the California Consumer Privacy Act (CCPA), which governs how consumer data is handled in California. Many other states don’t yet have comparable laws and are still working on guidelines and laws on what businesses can and cannot do with consumer data.

Many businesses have taken it upon themselves to develop guidelines for handling the data they collect. With the help of data experts, they ensure that all their information is kept safe and handled without breaching consumer privacy.

Principles that govern data collection

Knowing the principles that govern data collection is the first step in creating data policies that will protect not only consumers but also businesses.

Students in online business analytics programs will encounter them in the course, and it is important to understand what they are and why they matter.

  • Consent

This means getting the consent of the user before collecting their information and allowing them control over how you can use that data.

Most businesses get consent by getting users to click “I Agree” before they sign up for a service. Unfortunately, most users never read through consent agreements. The agreements are long, and the print is too small for most users to bother.

It is the job of data managers to ensure that they have a responsible data policy that sets out to protect users and defines how a business can use data.

Just because most people do not take the time to read user agreements does not mean they should be taken advantage of. It is the responsibility of the business to come up with sound policies that allow users full control of their information.

Some companies create manipulative agreements. They provide services but offer terms that force users to compromise their data. This isn’t a sound policy, and it can have dire consequences.

Before you issue a user agreement that allows you to collect user data, there are three important questions you must ask:

  1. Do you have express permission to collect personal information?
  2. Are users aware that you are collecting their data?
  3. Can they withdraw their consent at any time, and if they do, will you cease using their data right away?
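For readers who like to see ideas in code, the three questions above can be sketched as a small consent record. This is only an illustration, not a compliance tool: the `ConsentRecord` class, the `may_process` function, and the purpose names are all hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass
class ConsentRecord:
    """Hypothetical record of one user's consent for one stated purpose."""
    user_id: str
    purpose: str  # e.g. "email_marketing" (made-up purpose name)
    granted_at: Optional[datetime] = None
    withdrawn_at: Optional[datetime] = None

    def grant(self) -> None:
        # Express permission: call this only after an explicit user action.
        self.granted_at = datetime.now(timezone.utc)
        self.withdrawn_at = None

    def withdraw(self) -> None:
        # Withdrawal takes effect as soon as it is recorded.
        self.withdrawn_at = datetime.now(timezone.utc)

    def is_active(self) -> bool:
        return self.granted_at is not None and self.withdrawn_at is None


def may_process(record: ConsentRecord, purpose: str) -> bool:
    """Allow processing only with active consent for this exact purpose."""
    return record.is_active() and record.purpose == purpose


record = ConsentRecord(user_id="u-001", purpose="email_marketing")
before_grant = may_process(record, "email_marketing")   # no consent yet
record.grant()
after_grant = may_process(record, "email_marketing")    # consent given
other_purpose = may_process(record, "ad_targeting")     # never agreed to this
record.withdraw()
after_withdraw = may_process(record, "email_marketing")  # consent withdrawn
```

The point of the design is that every use of the data passes through one check, so withdrawing consent stops processing right away rather than at the next policy review.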

  • Transparency

Users should know exactly how a business plans to use their data. If, for example, you would like to start collecting data on user behavior so that you can use it to personalize ads, it is wise to let the users know. That way, when they see ads from you, they will remember that they opted in.

If you plan to sell the data to third parties, make it clear on the opt-in form. Those who don’t mind will fill out the form, and those who do will either leave or provide limited information.

You should also let users know that they can opt out should they choose to do so.

  • Privacy

A user may give you consent to collect, analyze, and store their data, but with this consent comes the tacit implication that their data is not available to the public.

There are numerous examples of user data being sold on the dark web. Apart from financial implications for users, it can lead to other more serious problems like stalking or even home invasion.

As a data expert, you must strive to protect personally identifiable information or PII. This is any kind of data that can be linked to a user, and it includes their name and date of birth, phone number and home address, social security number, credit card number, bank account number, and passport information.

When businesses collect this kind of data, it is often stored in huge databases on company servers or in the cloud. It is up to you, as the data professional, to secure it with encryption, two-factor authentication, and any other necessary means.

As we have seen, even the best protection sometimes fails. You can have every safeguard in place, and hackers may still find a way to access user data.

Data managers get around this problem by de-identifying datasets. This means that all PII is removed, making the data anonymous.
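As a rough illustration of de-identification, the sketch below drops direct identifiers from a record and replaces the user ID with a keyed hash so that rows belonging to the same person can still be grouped. The field names, the `deidentify` function, and the sample record are assumptions made for this example. Note that a keyed token is strictly pseudonymization: leave it out if you need the data to be fully anonymous.

```python
import hashlib
import hmac

# Assumed field names; adjust them to match your own schema.
PII_FIELDS = {
    "name", "date_of_birth", "phone", "home_address",
    "ssn", "credit_card", "bank_account", "passport",
}


def deidentify(record: dict, secret_key: bytes) -> dict:
    """Return a copy of the record without direct identifiers."""
    cleaned = {
        key: value
        for key, value in record.items()
        if key not in PII_FIELDS and key != "user_id"
    }
    # Keyed hash (HMAC-SHA256): the same user always maps to the same token,
    # but the token cannot be recomputed without the secret key.
    cleaned["user_token"] = hmac.new(
        secret_key, str(record["user_id"]).encode("utf-8"), hashlib.sha256
    ).hexdigest()
    return cleaned


# Made-up sample record for illustration only.
customer = {
    "user_id": 1017,
    "name": "Jane Example",
    "phone": "555-0100",
    "home_address": "1 Sample Street",
    "purchase_total": 42.5,
    "product_category": "toys",
}
safe_row = deidentify(customer, secret_key=b"replace-with-a-real-secret")
```

The analytics team can still total purchases by category or by token, but a stolen copy of the cleaned dataset no longer contains names, phone numbers, or addresses.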

  • Intention

Ethics and intention go hand in hand. For data collection and handling, it means that before you collect data, you must clearly define why you need it in the first place. What positive changes will it bring to your business, and might it have unintended consequences?

If you have a business that deals with toys, for example, it makes sense to collect only limited parent data. You may save their names and shipping addresses as well as their email addresses.

However, if you start to collect children’s data, you cross the line, and your intent is no longer clear. Children do not have buying power, so there isn’t a good reason for you to include them in your database.

Intention is one of the biggest problems facing social media companies. They are indiscriminate about the data they collect and often end up sending inappropriate marketing messages to children.

If certain data points do not help you solve a problem you are currently facing, then you should not include them.

As a guide, ask yourself the following questions before you embark on any data collection activity:

  1. Does this data help me achieve my overall aim?
  2. Could this data, in conjunction with other information I collect, be used to reveal personally identifiable information?
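The question about combined data can be tested roughly in code. The sketch below counts how many records share each combination of seemingly harmless fields, an idea usually called k-anonymity. The function name, the field names, and the sample rows are made up for illustration. If the smallest group has only one record, that combination points to a single person even though no name is stored.

```python
from collections import Counter


def smallest_group_size(rows, quasi_identifiers):
    """Size of the smallest group of rows sharing the same field values."""
    if not rows:
        return 0
    counts = Counter(
        tuple(row[field] for field in quasi_identifiers) for row in rows
    )
    return min(counts.values())


# Made-up rows: no names, yet some combinations may still be unique.
rows = [
    {"zip": "99999", "birth_year": 1985, "gender": "F"},
    {"zip": "99999", "birth_year": 1985, "gender": "F"},
    {"zip": "99999", "birth_year": 1990, "gender": "M"},
]

k_all_fields = smallest_group_size(rows, ["zip", "birth_year", "gender"])
k_zip_only = smallest_group_size(rows, ["zip"])
```

Here the ZIP code on its own is shared by all three rows, but adding birth year and gender isolates one person. That is a signal to drop or generalize a field before storing the dataset.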

How are algorithms affecting data collection and analysis?

The world is ruled by algorithms. Today, all programmers have to do is write a piece of code that achieves a certain goal and let it do its job. Data collection uses algorithms quite heavily.

It is important to keep in mind that algorithms do what they are told to do. If a programmer creates code that collects certain types of data, it will keep doing that until it is instructed otherwise.

Some algorithms are trained using non-representative datasets, and when activated, they skew information in a certain way.

In some cases, they display bias, and it takes a lot of analysis and careful consideration to discover and correct those biases.
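One simple first check for a non-representative dataset is to compare each group's share of the training data with its share of the population you actually serve. The sketch below is a minimal example under that assumption. The `representation_gaps` function, the group labels, and the reference shares are hypothetical, and a real bias review would go much further than this.

```python
from collections import Counter


def representation_gaps(samples, reference_shares):
    """Share of each group in the data minus its expected share."""
    if not samples:
        raise ValueError("samples must not be empty")
    counts = Counter(samples)
    total = len(samples)
    return {
        group: counts.get(group, 0) / total - expected
        for group, expected in reference_shares.items()
    }


# Made-up example: group "a" supplies 80% of the training rows,
# but only half of the customers the business serves.
training_groups = ["a"] * 8 + ["b"] * 2
gaps = representation_gaps(training_groups, {"a": 0.5, "b": 0.5})
```

A large positive gap means a group is over-represented, and a large negative gap means it is under-represented. Either one is a prompt to collect more balanced data before trusting what the algorithm produces.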

Many algorithms, particularly machine-learning models, are designed to learn from feedback. They pick up what you put in and always work within those parameters.

Imagine, for example, that you have a clothing line and tend to send certain fashions to certain countries. The algorithm will pick up on that, and when you activate it in the future, it will do what you have been doing; it will send the same fashions to the same countries because that is what you taught it to do.

What does this mean for data collection? It means you should be careful about the data you collect. If you do not need PII, you should not include it in your datasets because when you activate an algorithm, it will gather personal information that could put users at risk.
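One way to put this into practice is an allowlist applied at the point of collection, so that an algorithm only ever sees the fields you have decided you need. The sketch below assumes a hypothetical clothing order; the `ALLOWED_FIELDS` set and the `minimize` function are illustrative names, not part of any particular tool.

```python
# Fields the business has decided it needs (assumed for this example).
ALLOWED_FIELDS = {"order_id", "product_category", "country", "order_total"}


def minimize(record: dict) -> dict:
    """Keep only allowlisted fields; everything else is never stored."""
    return {key: value for key, value in record.items() if key in ALLOWED_FIELDS}


# Made-up incoming order, including details the analysis does not need.
incoming = {
    "order_id": "A-1001",
    "product_category": "jackets",
    "country": "CA",
    "order_total": 120.0,
    "customer_name": "Jane Example",
    "home_address": "1 Sample Street",
}
stored = minimize(incoming)
```

An allowlist fails safe. If a new form field appears tomorrow, it is dropped by default until someone decides it is needed.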

Common data collection pitfalls for businesses

There is a ton of information out there about how best to collect and store data, but businesses make mistakes. Statistics about hacking and leaks make it clear that it is only a matter of time before a business pays for poor data practices.

A good data manager is at the top of their game at all times, making sure to watch out for these pitfalls:

  • Assuming that data ethics don’t apply

This is a common problem in small businesses. CEOs and data managers assume that because they don’t collect much data, they shouldn’t pay attention to data ethics.

This is a big mistake. Most victims of hacking and ransomware are small businesses. There isn’t much sense in taking risks.

  • Leaving it all to the data expert

Many businesses assume that data scientists and managers will provide 100% protection.

In reality, data is the responsibility of everyone in the business. Most ransomware and phishing attacks arrive by email, so any employee’s inbox can be the way in.

Every employee should understand data ethics, be familiar with the company’s data policies and learn to recognize risk when they see it.

  • Chasing short-term gains

When times are tough, businesses do whatever they can to survive, and data ethics can go out the window. A business may decide to share user data because it is a quick way to raise revenue.

Short-term gains can lead to long-term repercussions. If you want to build a reputable and reliable brand, ensure that customer data is always protected.

How can businesses create strong and ethical data policies?

Every business faces data management challenges. Each year, with more data, those challenges grow. Companies can handle these challenges using the following tips:

  • Have specific rules governing how data is handled within your company

Each business is unique, so create a data policy tailored to the data you intend to collect.

Each employee should be aware of the rules governing information handling, analysis, and storage. A shared framework eliminates loopholes and makes it easy to identify weak points in the data pipeline.

  • Build a diverse data-handling team

Data scientists and managers bring the technical expertise you need to keep data safe, but they usually lack inside knowledge of how different departments work.

A good data team represents every department. Line managers shed light on the different data sets that are important, those that the company can do without, and even the different ways data should be analyzed.

  • Have scheduled data audits

How safe is your pipeline? How often do you check it for leaks? Data audits find problem points and help deal with issues before they occur.

Engage external auditors once or twice a year to check your data collection and storage systems for soundness. Take steps to correct weaknesses and improve data collection methods.

  • Stay abreast of the technology

The data landscape is constantly changing, especially when it comes to algorithms. Have your ear to the ground at all times.

Strive to know the latest data practices, what the latest algorithms can do, and any new technological developments that can improve how businesses interact with user data.

  • Seek secure storage solutions

Many companies still keep customer data on servers in their offices, where it can be accessed by anyone who makes it past security.

Cloud storage has reduced the need for in-house servers. Well-managed cloud services are often safer and harder to break into. They can also be cheaper, as they remove the need to purchase and maintain servers.

Choose a cloud solution that suits your needs and let the provider take care of storing your data.

  • Avoid careless mistakes

Most data loss occurs because someone somewhere was careless. They may use a password that is easy to guess, forget to log out, leave their computers exposed on public Wi-Fi, or even misplace hard drives that contain important information.


Data ethics and security should be at the forefront of every business. As the world generates more data, it has become more important than ever to guarantee its security and ensure that it is collected and handled ethically. With the right qualifications, you can join this growing field and help American businesses implement secure and ethical information collection and storage policies.