BigObject analytics is designed to analyze multi-dimensional data organized in a star or snowflake schema. The data is stored in tables, and BigObject has two types of tables: the dimension table and the fact table.


Dimension Table

A dimension consists of a set of items with descriptive properties called attributes. Business data, for example, is multi-dimensional: it may contain information such as product, channel, and price. When such information is organized and stored in a table object, it is referred to as a dimension table in BigObject.

Consider the ABC company example. As ABC company starts its business, its staff first wants to create a database for future analytics. The first thing to create may be a record of all its members, who happen to be customers of ABC company's clients. A dimension table called "Customer" may look like this:

id  name                  language  state                 company      gender  age
1   Ryan Mills            Korean    South Carolina        Tazz         Male    55
2   Christopher Thompson  Māori     District of Columbia  Thoughtblab  Male    13
3   Shawn Rose            Hindi     District of Columbia  Livetube     Male    33
4   Julia Kennedy         Italian   Florida               Fivespan     Female  30
5   Virginia Reynolds     Persian   North Carolina        Dabfeed      Female  42

To refer to an attribute a in a dimension table D, write D.a. For example, Customer.gender refers to the gender attribute in the Customer dimension table.

Note
Table names and attribute names are case-sensitive in BigObject.
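
The naming rule can be illustrated with a minimal Python sketch. It is not BigObject's API: the `TABLES` dictionary and the `resolve` helper are hypothetical names introduced here, and the rows are a subset of the sample Customer table. The sketch only shows how a Table.attribute reference splits into its two parts and why a change in letter case fails to match.

```python
# Hypothetical in-memory stand-in for the sample "Customer" dimension table.
# Illustration only; BigObject itself is not involved.
TABLES = {
    "Customer": [
        {"id": 1, "name": "Ryan Mills", "language": "Korean",
         "state": "South Carolina", "company": "Tazz",
         "gender": "Male", "age": 55},
        {"id": 2, "name": "Christopher Thompson", "language": "Māori",
         "state": "District of Columbia", "company": "Thoughtblab",
         "gender": "Male", "age": 13},
        {"id": 4, "name": "Julia Kennedy", "language": "Italian",
         "state": "Florida", "company": "Fivespan",
         "gender": "Female", "age": 30},
    ],
}


def resolve(reference):
    """Return the values named by a 'Table.attribute' reference.

    Both names are compared exactly, so matching is case-sensitive.
    """
    table_name, _, attribute = reference.partition(".")
    if table_name not in TABLES:
        raise KeyError(f"unknown table: {table_name!r}")
    rows = TABLES[table_name]
    if not rows or attribute not in rows[0]:
        raise KeyError(f"unknown attribute: {reference!r}")
    return [row[attribute] for row in rows]


print(resolve("Customer.gender"))
```

In this sketch, `resolve("customer.gender")` raises `KeyError` because the table name differs in case from "Customer".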

Fact Table

Facts are the data to be analyzed. When fact data is stored in a table object, it is referred to as a fact table in BigObject.
When creating the database, the ABC company staff also store the transaction records in a table for future analysis. A fact table called "sales" may look like the following:

order_id  Customer.id  Product.id  channel_name  Date                 qty  total_price
1         3226         2557        am/pm         2013-01-01 00:04:05  8    52.24
2         6691         2631        am/pm         2013-01-01 00:11:27  4    39.72
2         6691         1833        am/pm         2013-01-01 00:21:03  1    6.9
3         4138         1626        am/pm         2013-01-01 00:30:22  5    42.1
3         4138         375         am/pm         2013-01-01 00:35:44  6    67.26
3         4138         3336        am/pm         2013-01-01 00:45:12  8    41.68
3         4138         736         CVS           2013-01-01 00:55:34  6    56.4
4         1292         4434        7-11          2013-01-01 01:06:00  6    86.64

Note that the columns Customer.id and Product.id refer to the id column in the Customer dimension table and the id column in the Product dimension table, respectively.
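
As a rough sketch of how such a fact table lends itself to analysis, the Python below copies the sample sales rows into memory, extracts the dimension references from the column names, and sums total_price per group. It does not use BigObject; the names `COLUMNS`, `ROWS`, `dimension_references`, and `total_by` are hypothetical, and `Decimal` is chosen here only to keep the sums exact.

```python
from collections import defaultdict
from decimal import Decimal

# Column names and rows copied from the sample "sales" fact table.
COLUMNS = ["order_id", "Customer.id", "Product.id", "channel_name",
           "Date", "qty", "total_price"]
ROWS = [
    (1, 3226, 2557, "am/pm", "2013-01-01 00:04:05", 8, Decimal("52.24")),
    (2, 6691, 2631, "am/pm", "2013-01-01 00:11:27", 4, Decimal("39.72")),
    (2, 6691, 1833, "am/pm", "2013-01-01 00:21:03", 1, Decimal("6.9")),
    (3, 4138, 1626, "am/pm", "2013-01-01 00:30:22", 5, Decimal("42.1")),
    (3, 4138, 375, "am/pm", "2013-01-01 00:35:44", 6, Decimal("67.26")),
    (3, 4138, 3336, "am/pm", "2013-01-01 00:45:12", 8, Decimal("41.68")),
    (3, 4138, 736, "CVS", "2013-01-01 00:55:34", 6, Decimal("56.4")),
    (4, 1292, 4434, "7-11", "2013-01-01 01:06:00", 6, Decimal("86.64")),
]


def dimension_references(columns):
    """Map each 'Table.attribute' column to its (table, attribute) pair."""
    return {c: tuple(c.split(".", 1)) for c in columns if "." in c}


def total_by(column, rows):
    """Sum total_price for each distinct value of the given column."""
    key = COLUMNS.index(column)
    price = COLUMNS.index("total_price")
    totals = defaultdict(Decimal)  # Decimal() is zero, so sums start at 0
    for row in rows:
        totals[row[key]] += row[price]
    return dict(totals)


print(dimension_references(COLUMNS))
print(total_by("channel_name", ROWS))
print(total_by("Customer.id", ROWS))
```

Grouping by Customer.id yields keys that are ids in the Customer dimension table, which is what allows a fact row to be connected to that customer's attributes.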


Data Types

BigObject supports the following data types:

Type          Description
STRING        Encoded string ended with NULL (0) character
CHAR          Fixed-length string
VARSTRING     Variable-length string; suitable for non-repeatable strings to save space
BYTE          Single character (ASCII range 32-126)
INT8          8-bit integer
INT16         16-bit integer
INT32         32-bit integer
INT64         64-bit integer
FLOAT         4-byte floating point
DOUBLE        8-byte double precision floating point
DATE32        Year, month, and day of month (4 bytes long)
DATETIME32    Date and time (4 bytes long)
DATETIME64    Date and time (8 bytes long)
TIMESTAMP     Timestamp with sub-second precision (8 bytes long)
IPv4          Internet Protocol version 4
IPv6          Internet Protocol version 6
BINARY        Fixed-length binary
VARBINARY     Variable-length binary
POINT         A geometry location with coordinates X and Y
LINESTRING    A sequence of points representing connected line segments
POLYGON       A sequence of points representing an exterior bounding ring and zero or more interior rings
MULTIPOLYGON  A collection of zero or more polygons
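
When choosing among the integer types, it helps to know what each width can hold. The sketch below assumes the integer types are signed two's-complement values, which the table above does not state, so treat the ranges as an assumption to verify against BigObject itself. The helper names `signed_range` and `smallest_type_for` are hypothetical.

```python
# Bit widths taken from the type table above.
INTEGER_BITS = {"INT8": 8, "INT16": 16, "INT32": 32, "INT64": 64}


def signed_range(type_name):
    """Return (min, max) for the type, ASSUMING signed two's complement."""
    bits = INTEGER_BITS[type_name]
    return -(1 << (bits - 1)), (1 << (bits - 1)) - 1


def smallest_type_for(values):
    """Return the narrowest integer type whose assumed range holds all values."""
    low, high = min(values), max(values)
    for name in INTEGER_BITS:  # dicts keep insertion order: narrowest first
        type_low, type_high = signed_range(name)
        if type_low <= low and high <= type_high:
            return name
    raise ValueError("values do not fit in a 64-bit integer")


print(signed_range("INT8"))
print(smallest_type_for([55, 13, 33, 30, 42]))  # the sample age values
```

Under this assumption, the sample age values fit in INT8, while the sample Customer.id values such as 6691 need at least INT16.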

The default and maximum lengths of each string type are listed below:

Type       default length  maximum length
STRING     63              1023
CHAR       32              786432
VARSTRING  255             786432
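
The limits above can be captured in a small lookup, sketched here in Python. The function `declared_length` is a hypothetical helper, not part of BigObject, and the lower bound of 1 for an explicit length is an assumption made for this sketch.

```python
# (default length, maximum length) per string type, from the table above.
STRING_LENGTHS = {
    "STRING": (63, 1023),
    "CHAR": (32, 786432),
    "VARSTRING": (255, 786432),
}


def declared_length(type_name, requested=None):
    """Return the length a column would get, or raise if it exceeds the limit.

    With no requested length, the documented default applies.
    The minimum of 1 is an assumption of this sketch.
    """
    default, maximum = STRING_LENGTHS[type_name]
    if requested is None:
        return default
    if not 1 <= requested <= maximum:
        raise ValueError(
            f"{type_name} length must be between 1 and {maximum}, got {requested}"
        )
    return requested


print(declared_length("STRING"))
print(declared_length("VARSTRING", 4000))
```

For example, a requested STRING length of 2048 is rejected by this check because it exceeds the documented maximum of 1023, whereas the same length is accepted for CHAR or VARSTRING.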

A LINESTRING, POLYGON, or MULTIPOLYGON value can contain at most about 48000 points.

The ranges of the date-related data types are listed below:

unit          DATE32          DATETIME32   DATETIME64
year          -32768 - 32767  2000 - 2063  -32768 - 32767
month         1 - 12          1 - 12       1 - 12
day of month  1 - 31          1 - 31       1 - 31
hours         N.A.            0 - 59       0 - 23
minutes       N.A.            0 - 59       0 - 59
seconds       N.A.            0 - 59       0 - 59
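
The ranges above matter mostly when picking a type for a date column, since DATETIME32 only covers the years 2000 to 2063. The Python sketch below encodes the table and reports which fields of a value fall outside a type's range. The names `RANGES` and `out_of_range` are hypothetical, and the check is purely per field: it does not verify that a day exists in a given month.

```python
# Field ranges per type, copied from the table above.
# Fields marked N.A. for a type are simply absent from its entry.
_DATE_FIELDS = {"month": (1, 12), "day": (1, 31)}
_TIME_FIELDS = {"hours": (0, 23), "minutes": (0, 59), "seconds": (0, 59)}

RANGES = {
    "DATE32": {"year": (-32768, 32767), **_DATE_FIELDS},
    "DATETIME32": {"year": (2000, 2063), **_DATE_FIELDS, **_TIME_FIELDS},
    "DATETIME64": {"year": (-32768, 32767), **_DATE_FIELDS, **_TIME_FIELDS},
}


def out_of_range(type_name, **fields):
    """Return the names of fields the type cannot store.

    A field is reported if it is not available for the type (N.A.)
    or if its value lies outside the documented range.
    """
    limits = RANGES[type_name]
    rejected = []
    for name, value in fields.items():
        if name not in limits:
            rejected.append(name)
            continue
        low, high = limits[name]
        if not low <= value <= high:
            rejected.append(name)
    return rejected


# First sample sales timestamp: 2013-01-01 00:04:05
print(out_of_range("DATETIME32", year=2013, month=1, day=1,
                   hours=0, minutes=4, seconds=5))
print(out_of_range("DATETIME32", year=1999, month=12, day=31))
```

A timestamp from 1999 is rejected for DATETIME32 by this check but accepted for DATETIME64, which is the kind of difference to weigh against the extra four bytes per value.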