Wednesday, July 1, 2020

Streaming data processing systems - part 1

Problem: I have a smart environment monitoring device at home. It has a few sensors: temperature, humidity, carbon monoxide, noise level, and air quality. It is connected to the internet, and every 30 seconds it sends this data to the server. In addition to the environment data, it sends device status data every hour, which describes the overall condition of the device.
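
To make the shape of the data concrete, here is a minimal sketch of what one 30-second reading might look like as a JSON message. The field names, units, and device id are assumptions made for illustration; the actual device format is not specified in this post.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class EnvironmentReading:
    """One 30-second sample from a device (all field names are hypothetical)."""
    device_id: str
    timestamp: str           # ISO 8601, UTC
    temperature_c: float
    humidity_pct: float
    co_ppm: float
    noise_db: float
    air_quality_index: int


def to_message(reading: EnvironmentReading) -> str:
    """Serialize a reading to the JSON text a device could send."""
    return json.dumps(asdict(reading), sort_keys=True)


# A made-up sample reading, for illustration only.
sample = EnvironmentReading(
    device_id="device-0001",
    timestamp="2020-07-01T12:00:00Z",
    temperature_c=22.5,
    humidity_pct=41.0,
    co_ppm=0.4,
    noise_db=35.2,
    air_quality_index=18,
)
message = to_message(sample)
print(message)
```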


This device has become very popular. It has been on the market for some time and has been sold to a million-plus households. This means we have a problem, a good one: we need to process all of this information at the server.

This is a typical problem for a stream processing system: high-velocity, high-volume data that our data processing system needs to handle.
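
A quick back-of-the-envelope calculation shows the scale. This sketch assumes exactly one million devices (the lower bound of "a million plus") and that every device reports on schedule.

```python
DEVICES = 1_000_000        # assumed lower bound: "a million plus" households
ENV_INTERVAL_S = 30        # one environment message every 30 seconds
STATUS_INTERVAL_S = 3600   # one status message every hour
SECONDS_PER_DAY = 86_400


def messages_per_second(devices: int, interval_s: int) -> float:
    """Average message rate if the devices are spread evenly over the interval."""
    return devices / interval_s


env_rate = messages_per_second(DEVICES, ENV_INTERVAL_S)
status_rate = messages_per_second(DEVICES, STATUS_INTERVAL_S)
env_per_day = DEVICES * (SECONDS_PER_DAY // ENV_INTERVAL_S)

print(f"environment messages/sec: {env_rate:,.0f}")
print(f"status messages/sec:      {status_rate:,.0f}")
print(f"environment messages/day: {env_per_day:,}")
```

Under these assumptions the server sees roughly 33 thousand environment messages per second, or about 2.9 billion per day.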

The Kappa architecture

An architectural pattern for stream processing systems is the Kappa architecture, which treats streams as first-class citizens. Apache Spark Streaming is a framework that uses micro-batches and Structured Streaming to process data from stream-oriented systems.
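
The micro-batch idea can be illustrated without any framework. The sketch below is plain Python, not Spark's API: it assumes events arrive in time order as (timestamp in seconds, value) pairs, groups them into fixed-length batches, and processes each batch as a unit.

```python
def micro_batches(events, batch_seconds):
    """Yield (batch_start, values) for time-ordered (timestamp, value) events."""
    current_start, current = None, []
    for ts, value in events:
        start = (ts // batch_seconds) * batch_seconds
        # A new batch begins when the event falls outside the current window.
        if current_start is not None and start != current_start:
            yield current_start, current
            current = []
        current_start = start
        current.append(value)
    if current:
        yield current_start, current


def average_per_batch(events, batch_seconds):
    """Process each micro-batch as a unit; here, average the values in it."""
    return [
        (start, sum(values) / len(values))
        for start, values in micro_batches(events, batch_seconds)
    ]


# Made-up temperature readings, one every 30 seconds from a single device.
events = [(0, 21.0), (30, 22.0), (60, 23.0), (90, 25.0), (120, 24.0)]
result = average_per_batch(events, batch_seconds=60)
print(result)
```

Because `micro_batches` is a generator, it never needs the whole stream in memory, which is the property that matters when the input is unbounded.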

In part two we will take a look at the Kappa architecture in detail and come up with an initial design for a system that handles such a use case.




Tuesday, June 23, 2020

Low Carb Diet

I am following a new low-carb, high-fat keto diet. Apparently carbs, especially simple carbs, can make you fat and raise insulin levels in the body, which inhibits fat loss.

Following are the high-level categories of foods. Each category has foods that are allowed and foods that are to be strictly avoided.

  1. Vegetables: Anything grown above the ground is generally considered a good low-carb vegetable. The following are the most desirable:
    1. Kale
    2. Broccoli raab
    3. Watercress
    4. Spinach
    5. Green leaf lettuce
    6. Celery
    7. Tomato
    8. Cucumber
    9. Bok choy
    10. White mushrooms
    11. Eggplant
    12. Asparagus
    13. Zucchini
    14. Bell peppers
    15. Cabbage
    16. Cauliflower
    17. Broccoli
    18. Fennel
    19. Green beans
  2. Fruits: Fruits with buttery fat and less sweetness are a good choice, as they contain fewer carbs
    1. Avocado
    2. Star fruit
    3. Blackberry
    4. Raspberry
    5. Honeydew melon
    6. Strawberry
    7. Cantaloupe
    8. Lemon
    9. Gooseberry
    10. Watermelon
    11. Peach
    12. Apricot
    13. Plum
    14. Blueberry
  3. Nuts: Again, nuts with good fats and fewer carbs
    1. Pecan
    2. Brazil Nut
    3. Macadamia
    4. Walnut
    5. Hazelnut
    6. Peanut
    7. Almond
  4. Dairy: High-fat dairy products are good; avoid milk
    1. Butter
    2. Cottage Cheese
    3. Ghee
    4. Cheddar Cheese
  5. Fish, Meat and Poultry: Most are good
    1. Fatty fish like tuna, salmon, and sardines
    2. Eggs
    3. Organic Chicken
Besides the above, there are special foods that need to be incorporated:
  • Chia seeds
  • Flax Seeds
  • Seaweed snacks
  • Coconut Oil
  • Avocado Oil
  • Oat milk

Tuesday, July 17, 2018

Uses of a permissioned chain

There has been a lot of excitement around blockchain, especially in the financial industry. Below are some of the use cases where blockchain can be applied in the financial industry. We will go into the details of each of these at a later date.


  • Treasury Services
  • Trade finance application
  • Cross border transaction
  • Asset management 
  • Post trade settlement


Thursday, December 3, 2015

Elastic Search by example Part 2

In the previous blog post we saw how to get up and running with Elasticsearch. We are now going to define the index and load some data into it.

Defining the index

In our quest to build the classifieds application, we will need a place to store the data. In Elasticsearch, data is stored in an index. Think of an index as a database: a logical entity that contains all the data.

An index contains types; a type can be thought of as a table in a database.


Using kopf or head (described in Part 1), we can define an index in our Elasticsearch instance.
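
As a rough sketch of what such a definition could look like, the snippet below builds an index definition body for Elasticsearch 2.x. The index name `classifieds`, the type name `ad`, and all field names are assumptions chosen for illustration; they are not taken from this series.

```python
import json

INDEX_NAME = "classifieds"   # hypothetical index name
TYPE_NAME = "ad"             # hypothetical type name (the "table")


def build_index_definition():
    """Return an index definition with one type and a few example fields."""
    return {
        "settings": {"number_of_shards": 1, "number_of_replicas": 0},
        "mappings": {
            TYPE_NAME: {
                "properties": {
                    # "string" is the text field type in Elasticsearch 2.x
                    "title": {"type": "string"},
                    "description": {"type": "string"},
                    "price": {"type": "double"},
                    "posted_on": {"type": "date"},
                }
            }
        },
    }


body = json.dumps(build_index_definition(), indent=2)
print(f"PUT /{INDEX_NAME}")
print(body)
```

The printed body is what would be sent with a PUT request to the index URL, or pasted into the REST client that kopf and head provide.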


Wednesday, December 2, 2015

Elastic Search by Example Part I



Elasticsearch has been gaining ground as yet another NoSQL solution. In this series we will look at Elasticsearch by building an application and getting to know some of its features.
This post is still a work in progress and will evolve over time as Elastic releases newer versions of Elasticsearch.

Why Elastic search?

Applications that need to perform free-form text search are ideal candidates for using Elasticsearch as their backing store. I am, however, not so comfortable using Elasticsearch where there are lots of updates happening. Applications that do a lot of analytics are also well suited to Elasticsearch.

Free text search

Before we get into the details, we need to understand the primary use of Elasticsearch. It is ideal if your application involves searching free-form text. Examples of such applications include legal document processing, text mining, and ad hoc search.


Elasticsearch is built on top of the Lucene search library, which is powerful and robust but limited to use as a library. Elasticsearch allows you to scale your search solution.


For the purposes of this series, we will try to build a classifieds application that supports searching for items.

A classified application

Let's try to build a classified ads example. Here the primary use case is to search for ads using keywords.
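
To give a feel for that use case, here is a sketch of the kind of request body a keyword search could use. It relies on the `multi_match` query from the Elasticsearch query DSL; the field names `title` and `description`, and the `/classifieds/ad` path in the comment, are hypothetical.

```python
import json


def build_keyword_query(keywords, fields=("title", "description"), size=10):
    """Build a search body that matches the keywords against several fields."""
    return {
        "size": size,
        "query": {
            "multi_match": {
                "query": keywords,
                "fields": list(fields),
            }
        },
    }


# This body would be POSTed to a search endpoint such as
# /classifieds/ad/_search (hypothetical index and type names).
query = build_keyword_query("used mountain bike")
print(json.dumps(query, indent=2))
```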



Architecture of the application

The diagram above illustrates, at a very high level, the three layers of the application. The data layer is the most important to us, as that is where Elasticsearch comes into the picture.



The grunt work
To get started, the obvious first step is to download Elasticsearch from here. As of this writing, the version available is 2.1.

The distribution is packaged as a zip file, elasticsearch-2.1.0.zip.
Unzip this to any location. The bin folder contains the scripts to start Elasticsearch; I am going to use elasticsearch.bat as I am using Windows.

elasticsearch-2.1.0\bin>elasticsearch.bat

[2015-12-02 19:19:48,083][INFO ][node ] [Richard Rider] version[2.1.0], pid[6028], build[72cd1f1/2015-11-18T22:40:03Z]
[2015-12-02 19:19:48,084][INFO ][node                     ] [Richard Rider] initializing ...
[2015-12-02 19:19:48,251][INFO ][plugins                  ] [Richard Rider] loaded [], sites []
[2015-12-02 19:19:48,611][INFO ][env                      ] [Richard Rider] using [1] data paths, mounts [[Windows7_OS (C:)]], net usable_space [93.7gb], net total_space [287.1gb], spins? [unknown], types [NTFS]
[2015-12-02 19:20:01,453][INFO ][node                     ] [Richard Rider] initialized
[2015-12-02 19:20:01,454][INFO ][node                     ] [Richard Rider] starting ...
[2015-12-02 19:20:02,257][INFO ][transport                ] [Richard Rider] publish_address {127.0.0.1:9300}, bound_addresses {127.0.0.1:9300}, {[::1]:9300}
[2015-12-02 19:20:02,297][INFO ][discovery                ] [Richard Rider] elasticsearch/CRv8mY8fTv--0lunNkta4Q
[2015-12-02 19:20:06,394][INFO ][cluster.service          ] [Richard Rider] new_master {Richard Rider}{CRv8mY8fTv--0lunNkta4Q}{127.0.0.1}{127.0.0.1:9300}, reason: zen-disco-join(elected_as_master, [0] joins received)
[2015-12-02 19:20:07,192][INFO ][gateway                  ] [Richard Rider] recovered [0] indices into cluster_state
[2015-12-02 19:20:07,310][INFO ][http                     ] [Richard Rider] publish_address {127.0.0.1:9200}, bound_addresses {127.0.0.1:9200}, {[::1]:9200}
[2015-12-02 19:20:07,311][INFO ][node                     ] [Richard Rider] started

The logs above show Elasticsearch starting up. The bracketed value after the logger name, "Richard Rider" here, is the name of the node. This name can be configured (more on this later); in this case it was assigned by Elasticsearch.

Once the service has started, open the HTTP address shown in the logs (http://localhost:9200) in the browser. You will see the JSON response below.

{
  "name" : "Richard Rider",
  "cluster_name" : "elasticsearch",
  "version" : {
    "number" : "2.1.0",
    "build_hash" : "72cd1f1a3eee09505e036106146dc1949dc5dc87",
    "build_timestamp" : "2015-11-18T22:40:03Z",
    "build_snapshot" : false,
    "lucene_version" : "5.3.1"
  },
  "tagline" : "You Know, for Search"
}
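
The same check can be scripted. The sketch below parses a response like the one above and pulls out the node name and versions. The helper names are my own, and `fetch_node_info` assumes Elasticsearch is listening on http://localhost:9200.

```python
import json
from urllib.request import urlopen

# A trimmed copy of the response shown above, used as sample input.
SAMPLE = """
{
  "name" : "Richard Rider",
  "cluster_name" : "elasticsearch",
  "version" : { "number" : "2.1.0", "lucene_version" : "5.3.1" },
  "tagline" : "You Know, for Search"
}
"""


def parse_node_info(text):
    """Extract the node name, cluster name, and versions from the root response."""
    info = json.loads(text)
    return {
        "node": info["name"],
        "cluster": info["cluster_name"],
        "version": info["version"]["number"],
        "lucene": info["version"]["lucene_version"],
    }


def fetch_node_info(base_url="http://localhost:9200"):
    """Call the root endpoint of a running node (assumed address) and parse it."""
    with urlopen(base_url, timeout=5) as response:
        return parse_node_info(response.read().decode("utf-8"))


print(parse_node_info(SAMPLE))
```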



Installing plugins
Elasticsearch has a plugin mechanism that allows you to add functionality. Two such plugins are kopf and head, which allow you to manage and visualize the cluster.

To install the plugins, execute the commands below from the bin folder:

elasticsearch-2.1.0\bin>plugin install lmenezes/elasticsearch-kopf/2.0

elasticsearch-2.1.0\bin>plugin install mobz/elasticsearch-head


From your browser, navigate to the following URLs:

kopf: http://localhost:9200/_plugin/kopf



head: http://localhost:9200/_plugin/head




With this we conclude the first part of this series. In subsequent parts we will build out the application and familiarize ourselves with Elasticsearch.