Learn Databricks PySpark & Spark SQL with our Notebook

 


In this post, we share our Databricks Learning Notebook, which covers syntax and best practices for the PySpark DataFrame API and Spark SQL.


This notebook teaches the basics and shows how to handle common data analyst tasks such as missing and duplicate data and column transformations.

Although this notebook was created for data analyst use cases, anyone who transforms data on Databricks will benefit from it.


You can download the notebook via the link below. The zipped file contains the same notebook in DBC, HTML, and Python file formats. Import the DBC file into your workspace to view the notebook.

Click here to Download the Notebook


We divided the Databricks Learning Notebook into 3 main categories:

  1. BASIC DEFINITIONS & LINKS
  2. DATAFRAMES
  3. SPARK SQL

You can navigate between sections using the side navigator. You can also collapse headers so that sections you are not using don't take up space.



















We also included useful links and courses for learning more about PySpark and Spark SQL. Keep in mind that some of the courses require a membership.

















To avoid using lots of cells, we combined similar code snippets in the same cell and added an empty cell below it for running a particular command. Copy the command you want to run (including any dependent code snippets, import statements, etc.), paste it into the empty cell below, and run it.
























If you liked this post, please don't forget to share and leave a comment below! 





Use Our Quote Library to Enrich your Presentations

  


In this post, we share our Quotation Library, which we gathered from all around the Internet.

We also labeled each quotation with related categories, so every time you need a quotation on a subject, you can easily filter the library and find one that suits your need.

These are the categories we have for a total of 727 quotes:

  • Attitude
  • Best Quotes of All Time
  • Brainy
  • Business
  • Change
  • Communication
  • Inspirational Quotes
  • Movies
  • Quotes About Life
  • Road to Success Quotes
  • Teamwork







You can download this quote library via the link below. (You can use slicers if you open it with Excel.)




If you liked this post, please don't forget to share and leave a comment below! 


Get the Latest News with our Knime Workflow

 

In this post, we take a look at our Knime project, which gets the latest news from major media outlets using their RSS feeds and combines it with a user-friendly UI.





Click here to Download this project from Official Knime Page



First of all, we connect to the media outlets' RSS feeds by using the RSS Feed Reader node.

Then we concatenate all the results into a single table and pass it to the dashboard component.
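
For readers who want to see the same idea outside Knime, the read-and-concatenate step can be sketched in plain Python roughly as follows. The two feeds are hypothetical inline RSS 2.0 snippets rather than real outlet feeds, and `read_feed` is an illustrative helper, not part of the workflow.

```python
import xml.etree.ElementTree as ET

# Hypothetical RSS 2.0 payloads standing in for two outlets' feeds.
FEEDS = {
    "Outlet A": """<rss version="2.0"><channel><title>Outlet A</title>
        <item><title>Markets rally</title><link>https://example.com/a/1</link></item>
        <item><title>New chip announced</title><link>https://example.com/a/2</link></item>
    </channel></rss>""",
    "Outlet B": """<rss version="2.0"><channel><title>Outlet B</title>
        <item><title>Storm warning issued</title><link>https://example.com/b/1</link></item>
    </channel></rss>""",
}


def read_feed(outlet, xml_text):
    """Return one row (dict) per <item> element in an RSS 2.0 document."""
    root = ET.fromstring(xml_text)
    return [
        {
            "outlet": outlet,
            "title": item.findtext("title", default=""),
            "link": item.findtext("link", default=""),
        }
        for item in root.iter("item")
    ]


# Concatenate the rows from every feed into a single table (a list of dicts).
news_table = [
    row
    for outlet, xml_text in FEEDS.items()
    for row in read_feed(outlet, xml_text)
]
```

Keeping the outlet name on every row is what makes it possible to filter the combined table by outlet later on.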







Inside the dashboard component, we first show the results as a table with a clickable hyperlink to each news story. We also provide the media outlet as a slicer for the user.





The second part of the dashboard component creates the word cloud that contains the most common words from the news. In this step, we first preprocess the data and then create a bag of words from the news. Then we calculate the term frequencies to show in the word cloud visual.
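
The preprocessing, bag-of-words, and term-frequency steps can be approximated in a few lines of Python. This is only a sketch: the headlines and the small stop-word list are made up for the example, and the Knime text processing nodes may apply different preprocessing.

```python
import re
from collections import Counter

# Hypothetical headlines and a tiny stop-word list, for illustration only.
headlines = [
    "Markets rally as the chip sector surges",
    "New chip announced for the data center",
    "Storm warning issued for the coast",
]
STOP_WORDS = {"a", "an", "and", "as", "for", "in", "of", "on", "the", "to"}


def preprocess(text):
    """Lowercase the text, keep alphabetic tokens only, and drop stop words."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [t for t in tokens if t not in STOP_WORDS]


# Bag of words: one list of terms per headline.
bags = [preprocess(h) for h in headlines]

# Term frequency across all headlines; these counts would size the words in the cloud.
term_freq = Counter(term for bag in bags for term in bag)
top_terms = term_freq.most_common(3)
```

Here "chip" appears in two headlines, so it would be drawn larger than any other word.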


Here is what the end result looks like:


























With just one click, you can have the breaking news ready and read it in the Knime browser.


If you like this project, please share and comment below. Also, don't forget to check our Knime section for more projects like this.






Scrape your Competitor's Websites with Advanced Web Scraper







In this post, we go over the details of our latest project, the Advanced Web Scraper for H&M Germany Knime workflow.


This workflow connects to the H&M Germany website and gets the product category and subcategory information, as well as all the product page URLs and price information, to aggregate at certain hierarchy levels.

This template can be used for other retailers. However, since each website is designed differently, some changes will be required.









We replicate the steps for each category on the H&M website. However, when the website gets UI updates, some code changes might be required as well.















First, we use the Webpage Retriever and XPath nodes to connect to the website and get the product categories. Then, we apply some transformations to get the price and format the data for our reports.
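
To give a feel for what the XPath and price transformation steps do, here is a minimal Python sketch that runs against a small, hypothetical page snippet. The markup, class names, and URLs are invented for the example and do not reflect the real H&M site. Python's built-in ElementTree supports only a subset of XPath and needs well-formed markup, so a real page would likely call for a dedicated HTML parser.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed markup; the real page structure will differ.
PAGE = """
<html><body>
  <ul id="menu">
    <li class="category"><a href="/de_de/damen.html">Damen</a></li>
    <li class="category"><a href="/de_de/herren.html">Herren</a></li>
  </ul>
  <div class="product"><a href="/de_de/p1.html">T-Shirt</a><span class="price">9,99 €</span></div>
  <div class="product"><a href="/de_de/p2.html">Jeans</a><span class="price">29,99 €</span></div>
</body></html>
"""


def parse_price(text):
    """Convert a German-formatted price such as '1.299,00 €' to a float."""
    number = text.replace("€", "").strip()
    # Drop the thousands separator, then turn the decimal comma into a point.
    return float(number.replace(".", "").replace(",", "."))


root = ET.fromstring(PAGE.strip())

# Category names and links, selected with an XPath-style expression.
categories = [
    (a.text, a.get("href"))
    for a in root.findall(".//li[@class='category']/a")
]

# Product name, URL, and numeric price for every product block.
products = [
    {
        "name": div.findtext("a"),
        "url": div.find("a").get("href"),
        "price": parse_price(div.findtext("span[@class='price']")),
    }
    for div in root.findall(".//div[@class='product']")
]
```

Because the selectors are tied to the page layout, they are the part most likely to need changes after a site redesign.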






We also calculate the product count at each price level per category to see how prices are distributed across categories. After these transformations, the data is ready to be sent to Power BI for further reporting.
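
The product count per price level and category can be sketched as a simple binning and counting step. The rows and the price level boundaries below are assumptions chosen for illustration; the workflow may define its price levels differently.

```python
from collections import Counter

# Hypothetical scraped rows: (category, price in EUR).
products = [
    ("Damen", 7.99), ("Damen", 9.99), ("Damen", 12.99), ("Damen", 29.99),
    ("Herren", 9.99), ("Herren", 49.99), ("Herren", 59.99),
]

# Assumed price levels: upper bounds are exclusive, and the last level is open-ended.
LEVELS = [(10, "under 10"), (25, "10 to 25"), (50, "25 to 50")]


def price_level(price):
    """Return the label of the first level whose upper bound exceeds the price."""
    for upper, label in LEVELS:
        if price < upper:
            return label
    return "50 and over"


# Count products for every (category, price level) pair.
counts = Counter(
    (category, price_level(price)) for category, price in products
)
```

A table of these counts, one row per category and price level, is the kind of shape a reporting tool can chart directly.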






By using metanodes, we can wrap and run all of these with just one click, and the data will be ready for our Power BI dataset.







We can also export the results to Excel, as shown below.


















If you liked this project, please don't forget to share and leave a comment below!






One Click to get Weather Forecast of your favorite cities


 In this post, we'll go over the Knime Workflow that gets the weather forecast information for the cities you've selected.


 


(This workflow is available for download at BI-FI Business Projects Knime Hub page.)

Click here to visit the download page


The first step in this workflow is to connect to the weather forecast website and extract the details we are interested in.

(https://www.weather-forecast.com/countries)

We will use Webpage Retriever and XPath nodes in Knime to achieve this.

Knime Workflow







Then, in the STEP 1 component, we let the user select the cities they want by using the Nominal Row Filter Widget. The city list includes major cities from all around the world.
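
Conceptually, the city selection keeps only the rows whose city is in the user's chosen set. A minimal Python sketch of that filtering, using hypothetical forecast rows and an illustrative `filter_cities` helper, could look like this:

```python
# Hypothetical forecast rows; the real workflow extracts these from the website.
forecast = [
    {"city": "Berlin", "day": "Mon", "high_c": 18},
    {"city": "Istanbul", "day": "Mon", "high_c": 24},
    {"city": "Tokyo", "day": "Mon", "high_c": 21},
    {"city": "Berlin", "day": "Tue", "high_c": 16},
]


def filter_cities(rows, selected):
    """Keep only the rows whose city is one of the selected values."""
    selected = set(selected)
    return [row for row in rows if row["city"] in selected]


# Example selection, standing in for the user's choice in the widget.
chosen = filter_cities(forecast, ["Berlin", "Tokyo"])
```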








After the user is done selecting the cities, we apply the required transformations in the STEP 3 metanode and feed the final data to the dashboard to show the forecast for the selected cities.




Just click the Execute All button in Knime and let the workflow do its magic.


For more workflows like this, check out the Knime section on our website and our Knime Hub profile.

  BI FI Business Knime Hub Profile








