The Structure of Data

Data is everywhere and it can be stored in lots of ways. Two general categories of data are:

Structured data: Organized in a certain format, such as rows and columns.
Unstructured data: Not organized in any easy-to-identify way.

For example, when we rate our favorite restaurant online, we’re creating structured data. But when we use Google Earth to check out a satellite image of a restaurant location, we’re using unstructured data.

Here’s a refresher on the characteristics of structured and unstructured data:

Structured Data

Structured data is organized in a specific way, which makes it easier for businesses to store the data and find the information they need. When you export structured data, the format remains intact.
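To make the idea of an intact export concrete, here is a minimal sketch in Python using the standard csv module. The restaurant ratings, the column names, and the helper functions export_csv and import_csv are all made up for illustration. The point is only that the rows and columns look the same after the data has been written out and read back.

```python
import csv
import io

# Hypothetical restaurant ratings: every row has the same three columns.
COLUMNS = ["restaurant", "stars", "visited"]
ratings = [
    {"restaurant": "Casa Verde", "stars": 5, "visited": "2023-04-01"},
    {"restaurant": "Taco Stop", "stars": 4, "visited": "2023-04-09"},
]


def export_csv(rows):
    """Write the rows to CSV text, with the column names as a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=COLUMNS)
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


def import_csv(text):
    """Read CSV text back into a list of row dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))


exported = export_csv(ratings)
restored = import_csv(exported)

# The columns survive the round trip, so finding information stays simple.
# Note that CSV stores every value as text, so the rating comes back as "5".
five_star = [row["restaurant"] for row in restored if row["stars"] == "5"]
print(five_star)
```

The same kind of lookup would be much harder on a satellite image or a video, because there is no column to filter on.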

Unstructured Data

Unstructured data doesn’t have a clear organization. There is much more unstructured data than structured data in the world. Examples of unstructured data include videos, audio files, text files, social media content, images, presentations, PDFs, survey responses, and websites.

The Fairness Issue

Because unstructured data lacks organization, it is challenging to search, manage, and analyze. However, recent advances in artificial intelligence and machine learning are making this easier. The new challenge for data scientists is to ensure that these tools are fair and unbiased. Otherwise, some elements of the data will be given more importance or representation than others. An unfair dataset leads to skewed outcomes, low accuracy, and unreliable analysis, all of which we should avoid.
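One simple way to see whether some elements of a dataset are represented more than others is to count how often each group appears. The sketch below assumes a hypothetical set of restaurant reviews that have already been labeled with a neighborhood. The field names and the representation_shares helper are illustrative only, and a check like this is a starting point rather than a complete fairness review.

```python
from collections import Counter


def representation_shares(records, key):
    """Return the fraction of records that fall into each group under `key`."""
    counts = Counter(record[key] for record in records)
    total = sum(counts.values())
    return {group: count / total for group, count in counts.items()}


# Hypothetical labeled reviews; in practice these labels would have to come
# from the data itself or from a separate labeling step.
reviews = [
    {"text": "Great pasta", "neighborhood": "downtown"},
    {"text": "Slow service", "neighborhood": "downtown"},
    {"text": "Lovely patio", "neighborhood": "downtown"},
    {"text": "Best tacos in town", "neighborhood": "suburbs"},
]

shares = representation_shares(reviews, "neighborhood")
for group, share in shares.items():
    print(f"{group}: {share:.0%} of reviews")
```

If one group makes up most of the data, as downtown does here, any analysis built on the dataset will mostly reflect that group.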
