Let’s Git it Going
My First Practice Project Uploaded to GitHub
I never imagined, even in my wildest dreams, that one day I would be using Git. I had only ever heard about Git and source control from my developer friends. Now here I am, uploading a practice project to test the waters. Anyway, there is a first time for everything.
For this practice project, I decided to create a bar chart in Colab and upload it to my GitHub repository. You may click it here. My project is really simple: I just want to graph the total revenue of each product sold. I have a sales dataset with order ID, product, price each, quantity, date, and address columns. I followed Keith Galli on YouTube to make this project. I followed his examples and figured out the rest using Google, Stack Overflow, and some documentation. I highly recommend his channel if you want to make some projects just for practice like I did.
I will walk you through the different parts of the project I made and some important things I have learned thus far. At the end, I will show how I uploaded the project from Colab to GitHub.
Creating the Chart
Importing the Dataset
If you want to import a file from your local machine into a Colab notebook, you need to use Colab’s files.upload() function. This differs from a local Jupyter notebook, where you can usually read the file directly from disk. Maybe later, after more practice with both, I will write up a list of the differences I find between Jupyter Notebook and Colab.
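Here is a minimal sketch of the upload step. In Colab, files.upload() opens a file picker and returns a dictionary that maps each file name to its contents as bytes. The file name and the one-row sample below are made up, and they are only used as a fallback so the snippet also runs outside Colab.

```python
import io

import pandas as pd

# Hypothetical file name and sample content, used only when not running in Colab.
FALLBACK_NAME = "sales.csv"
FALLBACK_BYTES = (
    b"Order ID,Product,Quantity Ordered,Price Each\n"
    b"1,USB-C Charging Cable,2,11.95\n"
)

try:
    from google.colab import files  # this module only exists inside Colab

    uploaded = files.upload()  # opens a file picker; returns {file name: bytes}
except ImportError:
    uploaded = {FALLBACK_NAME: FALLBACK_BYTES}

# Take the first uploaded file and hand its bytes to pandas.
name = next(iter(uploaded))
df = pd.read_csv(io.BytesIO(uploaded[name]))
print(df.head())
```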
Reading the file
In the previous step we uploaded the CSV file; now we need to read it and check what is in it. In the Order Date column, the top (most frequent) value is “Order Date”. This is odd, and I suspect some header lines were repeated in the file. We will need to remove them. It is important to check your data, either by inspecting the CSV file itself or with tools available in Python such as the pandas describe() function. I learned this the hard way: when I first did this project, I kept getting all sorts of weird errors when I tried to do computations.
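To show what that check looks like, here is a small sketch with a made-up CSV that has the header line repeated in the middle of the data. The column names are my assumption of how the dataset is laid out, and the rows are illustrative only. Because every column contains text, describe() reports count, unique, top, and freq instead of numeric statistics.

```python
import io

import pandas as pd

# Made-up sample with repeated header lines, similar to the problem described above.
raw = """Order ID,Product,Quantity Ordered,Price Each,Order Date
1,USB-C Charging Cable,2,11.95,04/19/19 08:46
Order ID,Product,Quantity Ordered,Price Each,Order Date
2,Macbook Pro Laptop,1,1700,04/07/19 22:30
Order ID,Product,Quantity Ordered,Price Each,Order Date
3,USB-C Charging Cable,1,11.95,04/12/19 14:38
"""

df = pd.read_csv(io.StringIO(raw))

# For text columns, describe() shows the most frequent value in the "top" row.
summary = df.describe()
print(summary)

top_order_date = summary.loc["top", "Order Date"]
print("Most frequent Order Date value:", top_order_date)
```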
Dropping Rows
We can now see that the repeated header lines have been dropped. We can proceed to checking and correcting some of the data types in the file.
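One way to drop those rows is a boolean mask that keeps every row whose Order Date is not the literal text “Order Date”. This is a sketch of the idea with made-up rows, not necessarily the exact line from my notebook.

```python
import pandas as pd

# Made-up rows; the second row is a repeated header line.
df = pd.DataFrame(
    {
        "Product": ["USB-C Charging Cable", "Product", "Macbook Pro Laptop"],
        "Price Each": ["11.95", "Price Each", "1700"],
        "Order Date": ["04/19/19 08:46", "Order Date", "04/07/19 22:30"],
    }
)

# True for rows that are really header lines.
is_header = df["Order Date"] == "Order Date"

# Keep everything else and renumber the index.
clean = df[~is_header].reset_index(drop=True)
print(clean)
```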
Checking and Changing Data Types
In the image above you can see that on first inspection the data types were all “object”. This means pandas is storing every value as a generic Python object (here, text) rather than as a number. We cannot do arithmetic on these values while they are stored as text. We will therefore convert the columns we intend to use in the project to floating-point numbers. We choose floats because we want to account for decimal values.
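A short sketch of the inspection and conversion, again with made-up rows and assumed column names. I use pd.to_numeric here and then cast to float, which is one of several ways to do it.

```python
import pandas as pd

# Made-up rows where the numbers were read in as text.
df = pd.DataFrame(
    {
        "Product": ["USB-C Charging Cable", "Macbook Pro Laptop"],
        "Quantity Ordered": ["2", "1"],
        "Price Each": ["11.95", "1700"],
    }
)
print(df.dtypes)  # text columns before the conversion

# Convert only the columns needed for the computation.
for column in ["Quantity Ordered", "Price Each"]:
    df[column] = pd.to_numeric(df[column]).astype(float)

print(df.dtypes)  # the two converted columns are now float64
```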
Adding a Computed Column
I wanted to graph revenue, but the data I was given has no field for the total amount paid on each order line, so I decided to compute it myself. The formula is straightforward: multiply quantity ordered by price each.
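The computed column can be sketched like this. The column name Amount and the sample rows are assumptions for illustration.

```python
import pandas as pd

# Made-up order lines with numeric columns.
df = pd.DataFrame(
    {
        "Product": ["USB-C Charging Cable", "Macbook Pro Laptop"],
        "Quantity Ordered": [2.0, 1.0],
        "Price Each": [11.95, 1700.0],
    }
)

# Total paid per order line = quantity ordered x price each.
df["Amount"] = df["Quantity Ordered"] * df["Price Each"]
print(df)
```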
Now that I am writing this blog, I have some regrets about the data type I chose for quantity ordered. I chose float when I should have used int. Why? I just realized that the products are mostly electronics, and no one will sell 1.56 units of a phone.
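If I were to redo it, the quantity conversion might look like the sketch below. The values are made up, and this assumes every quantity in the column is a whole number.

```python
import pandas as pd

# Made-up quantities that were read in as text.
quantity = pd.Series(["2", "1", "3"], name="Quantity Ordered")

# Parse the text as numbers, then store them as integers.
quantity = pd.to_numeric(quantity).astype(int)
print(quantity.dtype)
```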
Checking if Matplotlib is Installed in Colab
I decided to install the package I wanted to use in Colab. Luckily, it’s already there.
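If you would rather check from Python before installing anything, the standard library can tell you whether a package is importable. This is an alternative sketch, not what I ran in my notebook, and is_installed is a helper name I made up.

```python
import importlib.util


def is_installed(package_name):
    """Return True if a top-level package can be found for import."""
    return importlib.util.find_spec(package_name) is not None


print("matplotlib installed:", is_installed("matplotlib"))
```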
Preparing my Data for the Chart
In the image above, you can see that I grouped the rows by Product, summed their Amounts, and then sorted the values in descending order. And yes, it is just one line doing all of that. I learned all of this from the YouTuber I mentioned above.
What this line returns is still not quite something I can graph. The summed amounts work as the bar heights (the y-axis), but I still don’t have labels for the x-axis. For that I will use the products. The products are actually the index of the grouped result, so I decided to pull them out into a list as seen below.
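Here is a small sketch of both steps with made-up amounts. The names revenue and products are mine, and the Product and Amount column names are assumed.

```python
import pandas as pd

# Made-up order lines with a precomputed Amount column.
df = pd.DataFrame(
    {
        "Product": [
            "USB-C Charging Cable",
            "Macbook Pro Laptop",
            "iPhone",
            "Macbook Pro Laptop",
        ],
        "Amount": [12.0, 1700.0, 600.0, 1700.0],
    }
)

# Group by product, sum the amounts, and sort from highest to lowest.
revenue = df.groupby("Product")["Amount"].sum().sort_values(ascending=False)

# The product names are the index of the grouped result.
products = list(revenue.index)

print(revenue)
print(products)
```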
Creating Vertical Bar Graph
As you can see in this graph, Macbook Pro Laptop brings in the most revenue for this company.
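A vertical bar chart along these lines can be sketched with Matplotlib as follows. The numbers are made up and only there to make the snippet runnable. The Agg backend line is an assumption for running outside a notebook; in Colab you can leave it out and call plt.show() instead.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend; not needed inside Colab
import matplotlib.pyplot as plt

# Made-up totals, already sorted from highest to lowest.
products = ["Macbook Pro Laptop", "iPhone", "USB-C Charging Cable"]
revenue = [3400.0, 600.0, 12.0]

fig, ax = plt.subplots()
bars = ax.bar(products, revenue)  # products on the x-axis, revenue as bar heights
ax.set_xlabel("Product")
ax.set_ylabel("Total revenue")
ax.tick_params(axis="x", labelrotation=90)  # long product names read better rotated
fig.tight_layout()
```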
I now realize how long the process of cleaning and pre-processing data is, just to make this very simple graph. I have a new-found respect for data cleaning. It is very important to have clean data, because otherwise you can end up with charts that do not make sense or show wrong values or, worse, no chart at all, just errors!
Committing your Project to GitHub
Download your Colab project as an IPython notebook file
To download your Colab file, go to File > Download .ipynb.
Create a New Repository in GitHub
Log in to your GitHub account. Then, go to Repositories and click New.
Copy the URL of your new repository. Please don’t mind the name of the repository in the image; I wanted to show what a newly created repository looks like.
Cloning Repository to your Laptop
Open a command prompt and go to the folder where you want to clone your repository. Enter “git clone (URL of your repository)”.
Copy your project to your local repository
Go to the folder where you cloned your repository and copy the file you downloaded from Colab into it.
Uploading your Project to GitHub
To upload your project to GitHub, open a command prompt in your local repository folder and enter “git add .”. Yes, you need a space between the word add and the dot.
Then, enter “git status”. This lists the files that are staged for the next commit.
Next, we need to configure your identity so that Git can label your commits. Enter “git config --global user.email "(your GitHub email here)"”.
We will now commit your project to your local repository. Enter “git commit -m "rush commit"”.
After committing, we push the project to GitHub. Enter “git push -u origin master”. If the default branch of your repository is named main (the default for newer GitHub repositories), use main instead of master. You will be prompted for your GitHub credentials. Note that GitHub no longer accepts your account password for Git operations over HTTPS; enter a personal access token in place of the password.
If the upload is successful, you should be able to see your file in your GitHub repository.
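To recap, the command sequence above can be collected in one place. Since the rest of this project is in Python, here is a rough sketch that builds the same Git commands as lists and can run them with subprocess. The function names, email, and commit message are placeholders I made up, and run_all assumes Git is installed and that the folder is your cloned repository.

```python
import subprocess


def upload_commands(email, message, branch="master"):
    """Return the Git commands from the steps above, in order."""
    return [
        ["git", "add", "."],
        ["git", "status"],
        ["git", "config", "--global", "user.email", email],
        ["git", "commit", "-m", message],
        ["git", "push", "-u", "origin", branch],
    ]


def run_all(commands, repo_dir):
    """Run each command inside repo_dir and stop at the first failure."""
    for command in commands:
        subprocess.run(command, cwd=repo_dir, check=True)


# Print the commands without running them.
for command in upload_commands("you@example.com", "Add practice notebook"):
    print(" ".join(command))
```

Calling run_all(upload_commands(...), "path/to/your/repo") would execute them for real, so only do that inside the folder you cloned.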
Hopefully you found this long blog helpful. Until the next blog then, see you!