BigObject allows user-defined functions to be used within a query statement on table objects as well. Examples will be used to illustrate how it works.
IN GROUP BY
Assuming that a user would like to compare weekday and weekend sales, this question can be easily answered by the following SELECT statement:
SELECT wkd(Date), SUM(total_price) FROM sales GROUP BY wkd(Date)
An example result table is as shown below:
As can be seen from the result, the weekday and weekend sales is about 3.9M
and 1.55M respectively. The query is enabled by applying function
determine whether it is weekday (denoted by 0) or weekend (denoted by 1) for
wkd() is quite simple:
function wkd(d) wd = d:weekday() if (wd >= 1 and wd <= 5) then return 0 else return 1 end end
The argument d is a DateTime type variable. We use built-in function
weekday() to obtain its week day. It returns 0 if it is weekday (1 to 5) or
1 if it is weekend (0 or 6).
User-defined functions can be used with other attributes in the GROUP BY clause, such as:
SELECT Product.brand, wkd(Date), SUM(total_price) FROM sales GROUP BY Product.brand, wkd(Date)
The above SELECT statement returns the weekday and weekend sales for each product brand.
User-defined functions can also be used in WHERE clause as a filter. For example, if the user only wants to obtain weekend sales, he can do:
SELECT SUM(total_price) FROM sales WHERE wkd(Date) = 1
BigObject does not support expressions as selected values. However,
it can be achieved by table programming. Assuming that the user would like to calculate the unit price for each transaction, he can
function unit_price(qty, total_price) return qty / total_price; end
Then the following SELECT can be issued together with function
unit_price() to obtain the unit price for each transaction.
SELECT unit_price(qty, total_price) from sales
Run a code script (for advanced user).
The variable of the return value is "_ret", and the other arguments will be arg, arg,... e.g. performing a user-defined equation based on qty and total_price columns
SELECT eval("_ret = (arg+arg)/arg" , qty , total_price) FROM sales LIMIT 10 SELECT * FROM sales WHERE eval("_ret = (arg*0.8 +arg*0.2)/arg" , qty , total_price) > 3 LIMIT 10
Compute the intersection between two strings.
The basic concept is to make two histograms (bins:0~255) from two strings which seen as byte arrays Then applying a measurement method to these histograms.
The strinter function includes three argument. The first argument specifies a string column for comparing. The second one is a comparing string. The third one is used to set a comparison method.
We implement three methods as follows:
count : return the intersection number between two histograms. (default) norm1 : count / length of the comparing string norm2 : count / (length of the first string * length of the second string)
e.g. list all customer names and the intersection of the string "Amy Mil"
SELECT name ,strinter(name , "Amy Mil") FROM Customer
e.g. list top 5 customer names based on the intersection of the string "Amy Mills sadas" (using "norm2" method)
SELECT name ,strinter(name , "Amy Mills sadas", "norm2") FROM Customer ORDER BY strinter(name , "Amy Mills sadas", "norm2") DESC LIMIT 5