Data Warehouse Schemas: Star, Snowflake, and Galaxy – Database Administration Essentials


 

We have already discussed multidimensional schemas in another article in this series. These schemas are designed to model data warehouse systems, and they can serve even large databases built for analytical purposes (OLAP, or online analytical processing, databases). The choice of warehouse schema is an important design decision when working with multidimensional models, and here we discuss the main types to consider.

 

Different types of data warehouse schemas

 

Here are the four major types of multidimensional data warehouse schemas, each of which comes with some unique benefits.

 

  1. Star Schema
  2. Snowflake Schema
  3. Galaxy Schema
  4. Cluster Schema

 

Star Schema

 

In a data warehouse star schema, a fact table sits at the center and is surrounded by dimension tables, each describing one dimension of the data. The resulting structure looks like a star, which gives the schema its name. The star schema is the simplest of the warehouse schemas. It is also called a Star Join Schema and is well suited to querying large data sets.

 

Star schema characteristics

 

  • Each dimension in a star schema is represented by a single dimension table.
  • The dimension table contains the set of attributes for that dimension.
  • Each dimension table is joined to the fact table through a foreign key.
  • Dimension tables are not joined to one another.
  • The fact table contains the keys and the measures.
  • The star schema is easy to understand.
  • Dimension tables are not normalized, so the schema trades some redundant storage for simpler queries.
  • The star schema is widely supported by business intelligence tools.
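
The characteristics above can be sketched with Python's built-in sqlite3 module. This is only a minimal illustration: the table names (fact_sales, dim_product, dim_store), the columns, and the sample rows are all hypothetical and do not come from any particular warehouse. It shows one fact table holding keys and a measure, two denormalized dimension tables, and a query that needs a single join to reach a dimension attribute.

```python
import sqlite3


def build_star_schema(conn):
    """Create a tiny, hypothetical star schema and load sample rows."""
    conn.executescript("""
        -- Denormalized dimension: the category level lives in the same table.
        CREATE TABLE dim_product (
            product_key   INTEGER PRIMARY KEY,
            product_name  TEXT,
            category_name TEXT
        );
        CREATE TABLE dim_store (
            store_key INTEGER PRIMARY KEY,
            city      TEXT,
            country   TEXT
        );
        -- Fact table: foreign keys to each dimension plus the measure.
        CREATE TABLE fact_sales (
            product_key INTEGER REFERENCES dim_product(product_key),
            store_key   INTEGER REFERENCES dim_store(store_key),
            amount      REAL
        );
    """)
    conn.executemany(
        "INSERT INTO dim_product VALUES (?, ?, ?)",
        [(1, "Laptop", "Electronics"),
         (2, "Phone", "Electronics"),
         (3, "Desk", "Furniture")],
    )
    conn.executemany(
        "INSERT INTO dim_store VALUES (?, ?, ?)",
        [(1, "Paris", "France"), (2, "Lyon", "France")],
    )
    conn.executemany(
        "INSERT INTO fact_sales VALUES (?, ?, ?)",
        [(1, 1, 1000.0), (2, 1, 500.0), (3, 2, 200.0), (1, 2, 1000.0)],
    )


def star_sales_by_category(conn):
    """Total sales per category: one join from the fact table to a dimension."""
    return conn.execute("""
        SELECT p.category_name, SUM(f.amount)
        FROM fact_sales AS f
        JOIN dim_product AS p ON p.product_key = f.product_key
        GROUP BY p.category_name
        ORDER BY p.category_name
    """).fetchall()


conn = sqlite3.connect(":memory:")
build_star_schema(conn)
print(star_sales_by_category(conn))
```

Note that "Electronics" is stored once per product row in dim_product. That repetition is the redundancy a denormalized dimension accepts in exchange for the single join.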

 

Snowflake Schema

The snowflake schema is another data warehouse schema: a logical arrangement of tables in a multidimensional database whose ER diagram resembles a snowflake. It extends the star schema by adding further dimension tables. The dimension tables in a snowflake schema are normalized, which splits their data into additional tables.

 

The major characteristics of the snowflake schema are:

 

  • It uses less disk space, because normalization removes redundant data.
  • It is easy to add a new dimension to the schema.
  • Because queries must join more tables, query performance is reduced.
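
As a rough sketch of these trade-offs, the following Python snippet (again using the standard sqlite3 module) normalizes a product dimension into a separate category table. The table and column names and the sample data are hypothetical. Each category name is stored only once, but the same "sales by category" question now needs two joins.

```python
import sqlite3


def build_snowflake_schema(conn):
    """Create a tiny, hypothetical snowflake schema and load sample rows."""
    conn.executescript("""
        -- The category level is split out of the product dimension.
        CREATE TABLE dim_category (
            category_key  INTEGER PRIMARY KEY,
            category_name TEXT
        );
        CREATE TABLE dim_product (
            product_key  INTEGER PRIMARY KEY,
            product_name TEXT,
            category_key INTEGER REFERENCES dim_category(category_key)
        );
        CREATE TABLE fact_sales (
            product_key INTEGER REFERENCES dim_product(product_key),
            amount      REAL
        );
    """)
    conn.executemany(
        "INSERT INTO dim_category VALUES (?, ?)",
        [(1, "Electronics"), (2, "Furniture")],
    )
    conn.executemany(
        "INSERT INTO dim_product VALUES (?, ?, ?)",
        [(1, "Laptop", 1), (2, "Phone", 1), (3, "Desk", 2)],
    )
    conn.executemany(
        "INSERT INTO fact_sales VALUES (?, ?)",
        [(1, 1000.0), (2, 500.0), (3, 200.0)],
    )


def snowflake_sales_by_category(conn):
    """Total sales per category: fact -> product -> category (two joins)."""
    return conn.execute("""
        SELECT c.category_name, SUM(f.amount)
        FROM fact_sales AS f
        JOIN dim_product  AS p ON p.product_key  = f.product_key
        JOIN dim_category AS c ON c.category_key = p.category_key
        GROUP BY c.category_name
        ORDER BY c.category_name
    """).fetchall()


conn = sqlite3.connect(":memory:")
build_snowflake_schema(conn)
print(snowflake_sales_by_category(conn))
```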

 

One major challenge with the snowflake schema is the additional maintenance effort it requires as the number of lookup tables grows.

 

If you have to choose between star and snowflake schemas, or need to identify which is apt for your data warehousing project, you may consult experts such as RemoteDBA.com.

 

The differences between Star and Snowflake Schemas

 

Here is a side-by-side comparison of the key differences between Star vs. Snowflake schema.

 

| Star Schema | Snowflake Schema |
|---|---|
| Hierarchies for each dimension are stored in the dimension table. | Hierarchies are divided into separate tables. |
| A fact table is surrounded by dimension tables. | A fact table is surrounded by dimension tables, which are in turn surrounded by further dimension tables. |
| A single join relates the fact table to each dimension table. | Many joins are needed to fetch the data. |
| Very simple and flexible database design | Complex database design |
| A single dimension table contains the aggregated data. | Data is split into several dimension tables. |
| Denormalized data structure | Normalized data structure |
| Higher data redundancy | Lower data redundancy |
| Cube processing is much faster. | Cube processing may be slower because of the complex joins. |
| Queries perform better with the help of query optimization. | The centralized fact table is not directly connected to every dimension; some are reached through other dimension tables. |

 

 

Galaxy Schema

 

A galaxy schema contains two or more fact tables that share dimension tables. It is also known as a Fact Constellation Schema. Because it can be viewed as a collection of star schemas, it is symbolically named a galaxy schema. The shared dimensions are known as conformed dimensions.

 

Galaxy Schema Characteristics:

 

  • The dimensions in a galaxy schema are split into separate dimensions based on the levels of their hierarchy. For example, if geography has four hierarchy levels (region, country, state, and city), the galaxy schema would have four dimensions.
  • A galaxy schema can be built by splitting one star schema into several star schemas.
  • The dimensions in a galaxy schema are large, because they must be built out for each level of the hierarchy.
  • A galaxy schema is helpful for aggregating fact tables so that the data is easier to understand.
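
To make the idea of shared (conformed) dimensions concrete, here is a minimal sketch using Python's sqlite3 module. The two fact tables (fact_sales and fact_shipments), the shared dimensions, and the data are hypothetical. Each fact table is aggregated on its own first and the results are then combined through the shared product dimension, which avoids double counting rows when two fact tables are joined directly.

```python
import sqlite3


def build_galaxy_schema(conn):
    """Create two hypothetical fact tables that share the same dimensions."""
    conn.executescript("""
        -- Conformed dimensions, shared by both fact tables.
        CREATE TABLE dim_product (
            product_key  INTEGER PRIMARY KEY,
            product_name TEXT
        );
        CREATE TABLE dim_date (
            date_key INTEGER PRIMARY KEY,
            year     INTEGER
        );
        CREATE TABLE fact_sales (
            product_key INTEGER REFERENCES dim_product(product_key),
            date_key    INTEGER REFERENCES dim_date(date_key),
            units_sold  INTEGER
        );
        CREATE TABLE fact_shipments (
            product_key   INTEGER REFERENCES dim_product(product_key),
            date_key      INTEGER REFERENCES dim_date(date_key),
            units_shipped INTEGER
        );
    """)
    conn.executemany("INSERT INTO dim_product VALUES (?, ?)",
                     [(1, "Laptop"), (2, "Phone")])
    conn.executemany("INSERT INTO dim_date VALUES (?, ?)",
                     [(20240101, 2024)])
    conn.executemany("INSERT INTO fact_sales VALUES (?, ?, ?)",
                     [(1, 20240101, 3), (1, 20240101, 2), (2, 20240101, 4)])
    conn.executemany("INSERT INTO fact_shipments VALUES (?, ?, ?)",
                     [(1, 20240101, 4), (2, 20240101, 4), (2, 20240101, 1)])


def sold_vs_shipped(conn):
    """Compare both fact tables per product via the shared dimension."""
    return conn.execute("""
        SELECT p.product_name,
               COALESCE(s.sold, 0),
               COALESCE(h.shipped, 0)
        FROM dim_product AS p
        LEFT JOIN (SELECT product_key, SUM(units_sold) AS sold
                   FROM fact_sales GROUP BY product_key) AS s
               ON s.product_key = p.product_key
        LEFT JOIN (SELECT product_key, SUM(units_shipped) AS shipped
                   FROM fact_shipments GROUP BY product_key) AS h
               ON h.product_key = p.product_key
        ORDER BY p.product_name
    """).fetchall()


conn = sqlite3.connect(":memory:")
build_galaxy_schema(conn)
print(sold_vs_shipped(conn))
```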

 

Cluster schema

 

A snowflake schema contains fully expanded hierarchies. However, complex hierarchies demand more joins and increase the complexity of the schema. A star schema, in contrast, contains fully collapsed hierarchies, which leads to redundancy. The best solution may be a balance between the two, and this is where the Star Cluster Schema comes in.

 

In a cluster schema, overlapping dimensions appear as forks in the hierarchies. A fork occurs when one entity acts as a parent in two different dimensional hierarchies. Fork entities can be identified as classification entities with multiple one-to-many relationships.
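
One possible reading of this, sketched below with Python's sqlite3 module, is to keep most hierarchy levels collapsed inside each dimension (as in a star schema) and to split out only the fork entity into its own table (as in a snowflake schema). In this hypothetical example, a region is the parent of both stores and customers, so dim_region is the fork. All names and data are assumptions made for illustration.

```python
import sqlite3

# Hypothetical mapping from a dimension name to its table and key column.
_DIMENSIONS = {
    "store": ("dim_store", "store_key"),
    "customer": ("dim_customer", "customer_key"),
}


def build_cluster_schema(conn):
    """Create a hypothetical star cluster schema with one fork entity."""
    conn.executescript("""
        -- Fork entity: parent of both the store and customer hierarchies.
        CREATE TABLE dim_region (
            region_key  INTEGER PRIMARY KEY,
            region_name TEXT
        );
        -- City stays collapsed in each dimension; only region is split out.
        CREATE TABLE dim_store (
            store_key  INTEGER PRIMARY KEY,
            store_name TEXT,
            city       TEXT,
            region_key INTEGER REFERENCES dim_region(region_key)
        );
        CREATE TABLE dim_customer (
            customer_key  INTEGER PRIMARY KEY,
            customer_name TEXT,
            city          TEXT,
            region_key    INTEGER REFERENCES dim_region(region_key)
        );
        CREATE TABLE fact_sales (
            store_key    INTEGER REFERENCES dim_store(store_key),
            customer_key INTEGER REFERENCES dim_customer(customer_key),
            amount       REAL
        );
    """)
    conn.executemany("INSERT INTO dim_region VALUES (?, ?)",
                     [(1, "North"), (2, "South")])
    conn.executemany("INSERT INTO dim_store VALUES (?, ?, ?, ?)",
                     [(1, "S1", "Oslo", 1), (2, "S2", "Rome", 2)])
    conn.executemany("INSERT INTO dim_customer VALUES (?, ?, ?, ?)",
                     [(1, "Ann", "Bergen", 1), (2, "Bob", "Naples", 2)])
    conn.executemany("INSERT INTO fact_sales VALUES (?, ?, ?)",
                     [(1, 1, 10.0), (1, 2, 20.0), (2, 2, 5.0)])


def sales_by_region(conn, via):
    """Total sales per region, reached through the store or customer dimension."""
    table, key = _DIMENSIONS[via]  # whitelisted names, never user input
    query = f"""
        SELECT r.region_name, SUM(f.amount)
        FROM fact_sales AS f
        JOIN {table} AS d ON d.{key} = f.{key}
        JOIN dim_region AS r ON r.region_key = d.region_key
        GROUP BY r.region_name
        ORDER BY r.region_name
    """
    return conn.execute(query).fetchall()


conn = sqlite3.connect(":memory:")
build_cluster_schema(conn)
print(sales_by_region(conn, "store"))
print(sales_by_region(conn, "customer"))
```

The shared dim_region table is stored once and reused by both hierarchies, while the city attribute remains denormalized, so only the fork adds an extra join.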

 

To conclude, here is an overview of the schemas discussed above. A multidimensional schema is designed specifically to model data warehouse systems. The star schema is the simplest type and has a structure resembling a star: a fact table surrounded by dimension tables, with a single join relating the fact table to each dimension table. The snowflake schema extends the star schema by surrounding the dimension tables with further dimension tables, so it needs many joins to fetch data. The galaxy schema, also called the Fact Constellation Schema, contains two or more fact tables that share dimension tables. Finally, the Star Cluster Schema combines attributes of the star and snowflake schemas.

 

When planning a schema for your data warehousing project, first analyze the nature and complexity of your data to see which schema best matches your needs.
