Wednesday, 30 November 2016

REST API WITH POSTGRES DATABASE




  1. I first started working in Eclipse Mars, but I got a lot of errors even with the sample program given at https://www.tutorialspoint.com/restful/restful_first_application.htm. The main drawback of Eclipse is that we need to add many Jersey JARs, JDK 1.7, and Tomcat to the application ourselves.
  2. I then moved on to NetBeans, because it comes bundled with the Jersey JARs, a JDK, and Tomcat.
Netbeans setup Environment


Give Executable permission :  $ chmod +x ~/Downloads/netbeans-8.1*.sh

Start Installer : $ cd ~/Downloads &&   ./netbeans-8.1*.sh

Set JDK location      : Browse to JDK location
-----------------------------------------------------------------------
Tomcat server setup Environment

  http://tomcat.apache.org/download-80.cgi => Find Full Documentation => .tar.gz => Download
$ tar -zxvf apache-tomcat-8.5.5.tar.gz
$ cd apache-tomcat-8.5.5/bin/
To start tomcat server:
$ ./startup.sh
Check whether Tomcat is running by browsing to:
http://localhost:8080
To stop tomcat server:
$ ./shutdown.sh
To change Tomcat user names and passwords:
$ gedit /path/to/apache-tomcat-8.5.5/conf/tomcat-users.xml
To change server port number:
$ gedit /path/to/apache-tomcat-8.5.5/conf/server.xml
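As a non-interactive alternative to editing server.xml by hand, the port change can be scripted with sed. This is only a sketch against a minimal stand-in server.xml (the /tmp path and the 8090 port are illustrative; the real file lives under apache-tomcat-8.5.5/conf/):

```shell
# Create a minimal stand-in server.xml (illustrative path)
mkdir -p /tmp/tomcat-demo/conf
cat > /tmp/tomcat-demo/conf/server.xml <<'EOF'
<Connector port="8080" protocol="HTTP/1.1"
           connectionTimeout="20000" redirectPort="8443" />
EOF
# Switch the HTTP connector from port 8080 to 8090
sed -i 's/port="8080"/port="8090"/' /tmp/tomcat-demo/conf/server.xml
grep 'port="8090"' /tmp/tomcat-demo/conf/server.xml
```

After changing the port on the real file, restart Tomcat with ./shutdown.sh and ./startup.sh.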
------------------------------------------------------------------------------------------------------------------------
Create a REST API application
  File=>New Project=>Java Web=>Web Application=>Next => Enter Project Name => next=>Select Server (Either GlassFish/Tomcat) => next=>Finish.






-----------------------------------------------------------------------
  

Postgres setup Environment
         
1. sudo apt-get update
2. sudo apt-get install postgresql postgresql-contrib
3. Switch to the postgres user account, which is associated with the postgres role
       sudo -i -u postgres
4. The Postgres prompt opens immediately when you type
       psql
5. To exit Postgres, type
       \q
6. To enter Postgres directly, use
       sudo -u postgres psql postgres
7. To set a password for the postgres user
       \password postgres
8. To create a database in Postgres
       create database databasename;
9. To connect to a database
       \c databasename
10. To quit
       \q


----------------------------------------------------------------------------------------------------






SETTING PASSWORD FOR POSTGRES:

supranimbus04@supranimbus04-Inspiron-3250:~$ sudo -u postgres psql
Password:


Postgres will ask for a password if the configuration file

$ sudo gedit /etc/postgresql/9.3/main/pg_hba.conf

contains the line:

local   all             postgres                                md5

If you do not want a password prompt, change that line in the same file to:

local   all             postgres                                peer

If any changes are made to the pg_hba.conf file, you need to restart the service:


sudo service postgresql restart
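The peer-to-md5 switch above can also be scripted. This is a sketch on a sample copy of the relevant line (the /tmp path is illustrative; the real file is /etc/postgresql/9.3/main/pg_hba.conf):

```shell
# Sample copy of the relevant pg_hba.conf line (illustrative path)
cp_file=/tmp/pg_hba.conf.demo
printf 'local   all             postgres                                peer\n' > "$cp_file"
# Require a password for local postgres connections: peer -> md5
sed -i '/^local.*postgres/ s/peer$/md5/' "$cp_file"
cat "$cp_file"
```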

--------------------------------------------------------------------------------------------------------


To view the Postgres service status
       
     $ sudo service postgresql status
--------------------------------------------------------------------------------------------------------



Creating table in postgres

  1. Create table tablename ();
  postgres=# create table testtbl(id int primary key not null, name char(30));

    2. To view the list of tables

  • \d  - lists all relations (tables, views, sequences) in the database
  • \dt - lists only the tables in the current database

   3. To delete a table

        drop table tablename;

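Instead of typing DDL at the psql prompt, the create-table statement from step 1 can be kept in a .sql file and run with psql -f. A sketch (the file path and database name are illustrative; psql itself is not invoked here):

```shell
# Write the DDL from step 1 to a file (illustrative path)
cat > /tmp/testtbl.sql <<'EOF'
CREATE TABLE testtbl (
    id   INT PRIMARY KEY NOT NULL,
    name CHAR(30)
);
EOF
# Run it against a database (requires a running server):
#   psql -U postgres -d databasename -f /tmp/testtbl.sql
cat /tmp/testtbl.sql
```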

   ------------------------------------------------------------------------------------------------------

For JDBC connection
 
  1. Download the PostgreSQL JDBC driver JAR (postgresql-*.jar) and add it to the NetBeans project libraries
  2. For reference visit this link
      http://java2db.com/java-integrations/jdbc-with-postgresql-using-myeclipse-and-netbeans
 
--------------------------------------------------------------------------------------------------------

Employee details program using the REST API with a Postgres database


 Access.java

(code shown in screenshot: Screenshot from 2016-10-21 09:16:44.png)

DataBase.java

(code shown in screenshot: Screenshot from 2016-10-21 09:18:08.png)

Employee.java

(code shown in screenshot: Screenshot from 2016-10-21 09:22:28.png)

AccessManager.java

(code shown in screenshot: Screenshot from 2016-10-21 09:23:46.png)

EmployeeService.java

(code shown in screenshot: Screenshot from 2016-10-21 09:26:02.png)

Output
         The JSON response is displayed at this URL:
(output shown in screenshot: Screenshot from 2016-10-21 09:29:53.png)
















SINGLE VIEW DEMO USING ZEPPELIN




It helps a retailer get a single view of a product by combining data from different sources and running interactive queries against the data.
In this case we collect data from three different sources:
  • ETL/CRM data from EDW/DB - ingested using Apache Sqoop
  • Web traffic data - ingested using HDF
  • Social media data - ingested using HDF
Steps:
  1. Pre-reqs: Setup and import retail data set into DB/EDW (in this case: Postgres)
  2. Use Sqoop to import CRM/ERP data from DB/EDW into Hive
  3. Use HDF to import related tweets into Hive
  4. Use HDF to import simulated web traffic logs into Hive
  5. Use Hive to analyze tables to populate statistics
  6. Use Hive to correlate the data from multiple data sources

1. Prerequisites:
  • First, give zeppelin sudo privileges
echo "zeppelin  ALL=(ALL) NOPASSWD: ALL" >> /etc/sudoers
  • Second, enable zeppelin to log in to Postgres
echo "host all all 127.0.0.1/32 md5" >> /var/lib/pgsql/data/pg_hba.conf

  • Create zeppelin user in Postgres
  • Setup Sqoop for Postgres
sudo wget https://jdbc.postgresql.org/download/postgresql-9.4.1207.jar -P /usr/hdp/current/sqoop-client/lib
ls -la /usr/hdp/current/sqoop-client/lib/postgres*
  • Pull latest data from github
ls single-view-demo
  • Download retail data set
cd /tmp
unzip data.zip
head "/tmp/data/Fact Sales.csv"
  • Import data into Postgres
head -n 16 ~/single-view-demo/contoso-psql.sql
export PGPASSWORD=<postgres password>
psql -U <postgres username> -d <databasename> -h localhost -f ~/single-view-demo/contoso-psql.sql
psql -U <postgres username> -d <databasename> -h localhost -c "\dt"

2.EDW Optimization

  • Check access to Postgres tables from sqoop
sqoop list-tables --connect jdbc:postgresql://localhost:5432/contoso --username <postgres username> --password <postgres password> -- --schema <schema>
  • Make sure hive is started
  • Load data from psql into hive
sqoop import-all-tables --username <postgres username> --password <postgres password> --connect jdbc:postgresql://localhost:5432/<databasename> --hive-import --direct
  • Create final table in ORC format and populate that final table from staging
  • Incremental import of data into Hive from RDBMS

Now that we have done the one-time bulk import, next we will set up an incremental Sqoop job. First we create a password file in HDFS containing the zeppelin user's Postgres password. This is done so invocations of the job can be automated/scheduled without having to manually pass the password.
echo -n "postgres username" > .password

hadoop fs -put .password /user/postgres username/

rm .password

  • Create incremental import sqoop job for final table
  1. --table: the table the job is for (i.e. the final table)
  2. --password-file: the HDFS location of the password file
  3. --incremental: lastmodified (we want to use last-modified logic to find delta records)
  4. --check-column: the column used to determine which delta records will be picked up
  • In Hive, move data from staging table to final table
  1. first remove the common records from final table that are also found in staging table
  2. copy data from staging table to final table
  3. truncate staging table
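The flags listed above combine into a single sqoop job command. Sqoop is not invoked here; the sketch only assembles and prints the command, and the table name (factsales), check column (updated_at), and merge key (salesid) are illustrative assumptions, not the demo's exact names:

```shell
# Assemble (but do not run) the incremental import job command;
# table/column names below are illustrative placeholders
SQOOP_JOB=$(cat <<'EOF'
sqoop job --create factsales_incremental -- import \
  --connect jdbc:postgresql://localhost:5432/contoso \
  --username postgres \
  --password-file /user/postgres/.password \
  --table factsales \
  --incremental lastmodified \
  --check-column updated_at \
  --merge-key salesid
EOF
)
echo "$SQOOP_JOB"
```

The saved job can then be run with sqoop job --exec factsales_incremental; Sqoop remembers the last-value between runs.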

3. Import related tweets into Hive
  • To install Nifi, start the 'Install Wizard': Open Ambari (http://sandbox.hortonworks.com:8080) then:
  • On bottom left -> Actions -> Add service -> check NiFi server -> Next -> Next -> Change any config you like (e.g. install dir, port, setup_prebuilt or values in nifi.properties) -> Next -> Deploy. This will kick off the install which will run for 5-10min.
  • Now open Nifi webui (http://sandbox.hortonworks.com:9090)
  • Download pre built Single_view_demo.xml template to your laptop from https://raw.githubusercontent.com/abajwa-hw/single-view-demo/master/template/Single_view_demo.xml
  • Import the Single_view_demo.xml template into Nifi
  • Fill out env specific configs e.g. your Twitter credentials
  • Do not start the flows yet
  • Twitter requires you to have an account and obtain developer keys by registering an “app”. Follow steps below to do this:
  1. Create a Twitter account and app and get your consumer key/token and access keys/tokens:
  2. Open https://apps.twitter.com
  3. sign in
  4. create new app
  5. fill anything
  6. create access tokens
  • Create a script to generate weblogs, and copy it to a dir accessible by NiFi

cp /home/zeppelin/single-view-demo/createlog-psql.sh /tmp
cp /home/zeppelin/single-view-demo/DimCustomer.csv /tmp
echo "/tmp/createlog-psql.sh /tmp/data/DimCustomer.csv 20 >> /tmp/webtraffic.log" >> /tmp/generate- weblog.sh
chmod 777 /tmp/createlog-psql.sh
chmod 777 /tmp/generate-weblog.sh
chmod 444 /tmp/DimCustomer.csv
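For reference, here is a minimal sketch of what a createlog-psql.sh-style generator does: read customer rows from a CSV and append one simulated hit per row to a traffic log. The CSV columns and log line format below are illustrative, not the demo's exact format:

```shell
# Tiny stand-in for DimCustomer.csv (illustrative columns: id,name,email)
cat > /tmp/DimCustomer.demo.csv <<'EOF'
1,Alice,alice@example.com
2,Bob,bob@example.com
EOF
# Append one simulated web hit per customer row (illustrative log format)
> /tmp/webtraffic.demo.log
while IFS=, read -r id name email; do
  echo "$(date '+%Y-%m-%d %H:%M:%S') customer=$id product=prod$((id % 3)) session=s$id" \
    >> /tmp/webtraffic.demo.log
done < /tmp/DimCustomer.demo.csv
wc -l < /tmp/webtraffic.demo.log
```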
  • Now start the flows in Nifi using the play button to:
  1. simulate and ingest the tweets under /tmp/tweets_staging in HDFS
  2. ingest the weblogs under /tmp/weblog_staging in HDFS
  • Once data has started to flow into the tweets and weblog HDFS locations, let's:

  1. update permissions on the HDFS dirs
  2. create hive tables for tweets and weblog
  3. check that hive tables contain data

4.Analyze tables/columns

  • Now let's run table level statistics on the tables we are interested in to improve query times
    
analyze table tweets compute statistics
analyze table weblog compute statistics
analyze table <final table> compute statistics

  • Next let's compute column statistics for the same tables
analyze table tweets compute statistics for columns
analyze table weblog compute statistics for columns
analyze table <final table> compute statistics for columns

5.Use Hive to correlate the data from different sources

  • Query for top most visited product:

select product,count(1) popular from weblog group by product order by popular desc limit 10
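The group-by/count/order logic of this first query can be sanity-checked outside Hive with plain awk on a small sample log (the two-column session/product layout here is illustrative):

```shell
# Tiny sample weblog: session, product (illustrative layout)
cat > /tmp/weblog.sample <<'EOF'
s1 prodA
s1 prodB
s2 prodA
s3 prodA
s3 prodB
EOF
# Count visits per product, order by count descending, take the top one
awk '{count[$2]++} END {for (p in count) print count[p], p}' /tmp/weblog.sample \
  | sort -rn | head -n 1
# -> 3 prodA
```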

  • Query for Top products with most time spent by year

select product,year_visited, sum(time_spent) as total_time from(select table1.product,year(visited) as year_visited, month(visited) as month,unix_timestamp(table1.lead_window_0) -unix_timestamp(table1.visited) as time_spent from (select product,visited,lead(visited)  over (PARTITION BY session ORDER BY visited asc) from weblog) table1
where table1.lead_window_0 is not NULL) table2 group by product,year_visited
order by total_time desc limit 100

  • Query for most popular web_path

select web_path,count(1) as path_count from (select session,concat_ws("->",collect_list(product)) as web_path from weblog group by session) table1 group by web_path order by path_count desc


6. Errors faced

  • Resource Manager starting error:
If the VirtualBox sandbox gets stuck while loading the Resource Manager, log in as root and run the command below

/usr/lib/hue/tools/start_scripts/start_ambari.sh
  • Zeppelin server starting error:
If Ambari is unable to start the Zeppelin service, run the command below
 #   BAD_FILE=`find /usr/hdp/current/zeppelin-server/lib/notebook -name "*.json" -size 0` ; mv `dirname $BAD_FILE` /tmp/

  • NullPointerException in Hive table at Zeppelin:
If your query is %hive select count(*) from table, the query actually sent looks like "elect count(*) from table". This is because the parser looks for a "(prefix_name)" and here mistakes "(*)" for a prefix. In that case use %hive(default) instead of %hive.


  • To add a Hive user and password in the interpreter:

  • First create a Unix user
           ssh root@server_ip_address (e.g. ssh root@127.0.0.1 -p 2222)
           adduser username
           passwd username

  • Then create an Ambari user with the same user name

  • Then, in the %hive interpreter properties, set
hive.user=username
hive.password=password

  • Then click the Save button

  • The Hive user is now added in the Hive interpreter