pulpcode

Wednesday, July 1, 2020

Streaming data processing systems - part 1

Problem: I have a smart environment monitoring device at home. It has several sensors: temperature, humidity, carbon monoxide, noise level and air quality. It is connected to the internet and sends its sensor readings to the server every 30 seconds. In addition to the environment data, it sends device status data every hour, which describes the overall condition of the device.

This device has become very popular. It has been on the market for some time and has been sold to more than a million households. This gives us a good problem to have: all of this information has to be processed at the server.

This is a typical problem for a stream processing system: high-velocity, high-volume data that our data processing system must handle.
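To put rough numbers on "high velocity and high volume", here is a back-of-the-envelope sketch. It assumes exactly one million devices (the post says "a million plus", so treat the results as a lower bound) and that devices report evenly over time; the function names are made up for illustration.

```python
# Back-of-the-envelope load estimate for the device fleet described above.
DEVICES = 1_000_000            # "a million plus" households, so a lower bound
ENV_INTERVAL_S = 30            # one environment reading every 30 seconds
STATUS_INTERVAL_S = 60 * 60    # one status message every hour
SECONDS_PER_DAY = 24 * 60 * 60


def messages_per_second(devices: int, interval_s: int) -> float:
    """Average arrival rate if devices report evenly over the interval."""
    return devices / interval_s


def messages_per_day(devices: int, interval_s: int) -> int:
    """Total messages per day for a fixed reporting interval."""
    return devices * (SECONDS_PER_DAY // interval_s)


env_rate = messages_per_second(DEVICES, ENV_INTERVAL_S)
env_daily = messages_per_day(DEVICES, ENV_INTERVAL_S)
status_daily = messages_per_day(DEVICES, STATUS_INTERVAL_S)

print(f"environment readings: ~{env_rate:,.0f}/s, {env_daily:,} per day")
print(f"status messages: {status_daily:,} per day")
```

Under these assumptions the server sees roughly 33,000 environment readings per second, or 2.88 billion per day, plus 24 million status messages per day.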

The Kappa architecture

The kappa architecture is an architectural pattern for stream processing systems that treats streams as first-class citizens. Apache Spark is one framework that can process data from stream-oriented systems, using micro batches (Spark Streaming) and Structured Streaming.
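The idea of micro batches can be illustrated without any framework. The sketch below is plain Python, not Spark's API: it groups readings into fixed-width time buckets by arrival time and aggregates each bucket. The reading shape, device ids and the 30-second batch width are assumptions chosen for illustration.

```python
from collections import defaultdict
from statistics import mean
from typing import Dict, Iterable, List, Tuple

# (device_id, arrival time in epoch seconds, temperature in Celsius):
# a hypothetical reading shape, not taken from the post.
Reading = Tuple[str, int, float]


def micro_batches(readings: Iterable[Reading], batch_seconds: int) -> Dict[int, List[Reading]]:
    """Group readings into fixed-width time buckets keyed by bucket start time."""
    batches: Dict[int, List[Reading]] = defaultdict(list)
    for reading in readings:
        _, arrival_s, _ = reading
        bucket_start = (arrival_s // batch_seconds) * batch_seconds
        batches[bucket_start].append(reading)
    return dict(batches)


def average_temperature(batch: List[Reading]) -> float:
    """Aggregate one micro batch: mean temperature across its readings."""
    return mean(temp for _, _, temp in batch)


readings = [
    ("dev-1", 0, 21.0),
    ("dev-2", 10, 23.0),
    ("dev-1", 30, 22.0),
    ("dev-2", 45, 24.0),
]

for start, batch in sorted(micro_batches(readings, batch_seconds=30).items()):
    print(start, average_temperature(batch))
```

Each bucket is processed as a small batch job, which is the trade-off micro batching makes: simpler processing per batch in exchange for latency of up to one batch interval.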

In part two we will take a look at the kappa architecture in detail and come up with an initial design of a system that handles such a use case.

Tuesday, June 23, 2020

Low Carb Diet

I am following a new low carb, high fat keto diet. Apparently carbs, especially simple carbs, can make you fat and raise insulin levels in the body, which inhibits fat loss.

Following are the high level categories of foods. Each category has foods that are allowed and foods that are to be strictly avoided.

Vegetables: Anything grown above the ground is generally considered a good low carb vegetable. The following are the most desirable:

Kale
Broccoli raab
Watercress
Spinach
Green leaf lettuce
Celery
Tomato
Cucumber
Bok choy
White mushrooms
Eggplant
Asparagus
Zucchini
Bell peppers
Cabbage
Cauliflower
Broccoli
Fennel
Green beans

Fruits: Fruits with buttery fat and less sweetness are a good choice, as they contain fewer carbs

Avocado
Star fruit
Blackberry
Raspberry
Honeydew melon
Strawberry
Cantaloupe
Lemon
Gooseberry
Watermelon
Peach
Apricot
Plum
Blueberry

Nuts: Again, nuts with good fats and fewer carbs

Pecan
Brazil Nut
Macadamia
Walnut
Hazelnut
Peanut
Almond

Dairy: High fat dairy products are good; avoid milk

Butter
Cottage Cheese
Ghee
Cheddar Cheese

Fish, Meat and Poultry: Most are good

Fatty fish like tuna, salmon and sardines
Eggs
Organic Chicken

Besides the above, these special foods need to be incorporated:

Chia seeds
Flax Seeds
Seaweed snacks
Coconut Oil
Avocado Oil
Oat milk

Tuesday, July 17, 2018

Uses of a permissioned chain

There has been a lot of excitement around blockchain, especially in the financial industry. Below are some of the use cases where blockchain can be applied in the financial industry. We will go into the details of each of these at a later date.

Treasury Services
Trade finance application
Cross border transaction
Asset management
Post trade settlement

Thursday, December 3, 2015

Elasticsearch by Example Part 2

In the previous blog post we saw how to get up and running with Elasticsearch. We are now going to define the index and load some data into it.

Defining the index

In our quest to build the classifieds application we will need a place to store the data. In Elasticsearch, data is stored in an index. Think of an index as a database: a logical entity which contains all the data.

An index contains one or more types; a type can be thought of as a table in a database.

Using kopf or head (described in part 1), we can define an index in our Elasticsearch instance.
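An index can also be defined through the REST API rather than a plugin UI. The following is a minimal sketch, assuming a local 2.x node on port 9200; the index name "classifieds", the type name "ad" and the three fields are hypothetical and not taken from this series.

```python
import json
from typing import Tuple
from urllib import request

# Assumptions: a local node on port 9200, an index called "classifieds",
# a type called "ad", and these three fields. None come from the post.
BASE_URL = "http://localhost:9200"


def build_create_index_request(index: str) -> Tuple[str, str, bytes]:
    """Return (method, url, body) for creating an index with one mapped type."""
    body = {
        "mappings": {
            "ad": {  # the type, comparable to a table
                "properties": {
                    "title": {"type": "string"},
                    "description": {"type": "string"},
                    "price": {"type": "double"},
                }
            }
        }
    }
    return "PUT", f"{BASE_URL}/{index}", json.dumps(body).encode("utf-8")


def create_index(index: str) -> str:
    """Send the request to a running node and return the raw response text."""
    method, url, body = build_create_index_request(index)
    req = request.Request(url, data=body, method=method)
    with request.urlopen(req) as resp:
        return resp.read().decode("utf-8")


if __name__ == "__main__":
    # Only prints the request; call create_index() against a running node.
    method, url, body = build_create_index_request("classifieds")
    print(method, url)
    print(body.decode("utf-8"))
```

Building the request separately from sending it makes it easy to inspect the mapping before anything touches the cluster.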

Wednesday, December 2, 2015

Elasticsearch by Example Part 1

Elasticsearch has been gaining ground as yet another NoSQL solution. In this series we will look at Elasticsearch by building an application and get to know some of its features.
This post is still a work in progress and will evolve over time as Elastic releases newer versions of Elasticsearch.

Why Elasticsearch?

Applications which need to perform free-form text search are ideal candidates for using Elasticsearch as their backing store. I am, however, not so comfortable using Elasticsearch where there are lots of updates happening. Applications that do a lot of analytics are also well suited to Elasticsearch.

Free text search

Before we get into the details of Elasticsearch, we need to understand its primary use. Elasticsearch is ideal if your application involves searching free-form text. Examples of such applications include legal document processing, text mining and ad hoc search.

Elasticsearch is built on top of Lucene, a powerful and robust search library. On its own, however, Lucene can only be used as a library; Elasticsearch allows you to scale your search solution.

For the purposes of this series we will try to build a classifieds application which supports search on items.

A classifieds application

Let's build a classified ads example. Here the primary use case is to search for ads using keywords.
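As a rough idea of what a keyword search looks like, the sketch below builds the body of a match query, which would be sent to the `_search` endpoint of an index. The field name "title" and the search terms are hypothetical.

```python
import json
from typing import Any, Dict


def keyword_query(field: str, keywords: str) -> Dict[str, Any]:
    """Build a match query that searches one field for the given keywords."""
    return {"query": {"match": {field: keywords}}}


# Hypothetical field name and search terms, for illustration only.
body = keyword_query("title", "mountain bike")
print(json.dumps(body, indent=2))
```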

Architecture of the application

The above diagram illustrates, at a very high level, the 3 layers of the application. The data layer is the most important to us, as that is where Elasticsearch comes into the picture.

The grunt work
To get started, the obvious first step is to download Elasticsearch from here. As of this writing, the version available is 2.1.

The distribution is packaged as a zip file, elasticsearch-2.1.0.zip.
Unzip this to any location. The bin folder contains the scripts to start Elasticsearch; I am going to use elasticsearch.bat as I am using Windows.

elasticsearch-2.1.0\bin>elasticsearch.bat

[2015-12-02 19:19:48,083][INFO ][node ] [Richard Rider] version[2.1.0], pid[6028], build[72cd1f1/2015-11-18T22:40:03Z]
[2015-12-02 19:19:48,084][INFO ][node ] [Richard Rider] initializing ...
[2015-12-02 19:19:48,251][INFO ][plugins ] [Richard Rider] loaded [], sites []
[2015-12-02 19:19:48,611][INFO ][env ] [Richard Rider] using [1] data paths, mounts [[Windows7_OS (C:)]], net usable_space [93.7gb], net total_space [287.1gb], spins? [unknown], types [NTFS]
[2015-12-02 19:20:01,453][INFO ][node ] [Richard Rider] initialized
[2015-12-02 19:20:01,454][INFO ][node ] [Richard Rider] starting ...
[2015-12-02 19:20:02,257][INFO ][transport ] [Richard Rider] publish_address {127.0.0.1:9300}, bound_addresses {127.0.0.1:9300}, {[::1]:9300}
[2015-12-02 19:20:02,297][INFO ][discovery ] [Richard Rider] elasticsearch/CRv8mY8fTv--0lunNkta4Q
[2015-12-02 19:20:06,394][INFO ][cluster.service ] [Richard Rider] new_master {Richard Rider}{CRv8mY8fTv--0lunNkta4Q}{127.0.0.1}{127.0.0.1:9300}, reason: zen-disco-join(elected_as_master, [0] joins received)
[2015-12-02 19:20:07,192][INFO ][gateway ] [Richard Rider] recovered [0] indices into cluster_state
[2015-12-02 19:20:07,310][INFO ][http ] [Richard Rider] publish_address {127.0.0.1:9200}, bound_addresses {127.0.0.1:9200}, {[::1]:9200}
[2015-12-02 19:20:07,311][INFO ][node ] [Richard Rider] started

The above logs show Elasticsearch starting up. The name in square brackets after the log category ("Richard Rider") is the name of the node. This name can be configured (more on this later); in this case "Richard Rider" was assigned by Elasticsearch.

Once the service has started, open the HTTP address shown in the log (127.0.0.1:9200) in a browser; you will see the JSON response below.

{
  "name" : "Richard Rider",
  "cluster_name" : "elasticsearch",
  "version" : {
    "number" : "2.1.0",
    "build_hash" : "72cd1f1a3eee09505e036106146dc1949dc5dc87",
    "build_timestamp" : "2015-11-18T22:40:03Z",
    "build_snapshot" : false,
    "lucene_version" : "5.3.1"
  },
  "tagline" : "You Know, for Search"
}
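The same response can be checked from a script instead of a browser. The sketch below parses a trimmed copy of the response shown above and pulls out the node name, cluster name and version; the function name is made up for illustration.

```python
import json
from typing import Tuple

# Trimmed copy of the response shown above.
SAMPLE_RESPONSE = """
{
  "name" : "Richard Rider",
  "cluster_name" : "elasticsearch",
  "version" : { "number" : "2.1.0", "lucene_version" : "5.3.1" },
  "tagline" : "You Know, for Search"
}
"""


def node_summary(response_text: str) -> Tuple[str, str, str]:
    """Extract (node name, cluster name, version number) from the root response."""
    info = json.loads(response_text)
    return info["name"], info["cluster_name"], info["version"]["number"]


print(node_summary(SAMPLE_RESPONSE))
```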

Installing plugins
Elasticsearch has a plugin mechanism that allows you to add functionality. Two such plugins are kopf and head, which allow you to manage and visualize the cluster.

To install the plugins, execute the commands below from the bin folder:

elasticsearch-2.1.0\bin>plugin install lmenezes/elasticsearch-kopf/2.0

elasticsearch-2.1.0\bin>plugin install mobz/elasticsearch-head

From your browser, navigate to the following URLs:

kopf: http://localhost:9200/_plugin/kopf

head: http://localhost:9200/_plugin/head

With this we conclude the first part of this series. In subsequent parts we will build out the application and familiarize ourselves with Elasticsearch.

Monday, June 8, 2009

Useful Java links

Learning Eclipse
Java tools
MIT OCW XML with Java, Java Servlets and JSPs