I was working on a project which included the transformation of an excel data into a database that could be used for pivot charts & PowerBI. The spreadsheet was already in a format similar to the pivot table so I had to convert that into a database kind of format so that with data labels I was intending to use the slicer in the spreadsheet. So in order to get the transformation, I initially thought of using the HLOOKUP & VLOOKUP
I prepared another sheet that links with the original sheet but uses the data as referenced by the HLOOKUP & VLOOKUP formulas, but I soon discarded this as it was getting difficult to have a diverse implementation of this process.
Next, I searched the internet for a solution on such database indexing & many times, the INDEX function from Excel Appeared & finally I thought of giving it a try. But before that, I need to prepare the references that shall be used by the INDEX function to return the correct value at the correct position. So Now I had to prepare a sequence that could now act as references & so I decided to create something similar to the clock function used in electronic applications such as logical circuits (MUX, DEMUX, etc.).
So I introduced two arbitrary parameters which I called H-CLOCK & V-CLOCK, both these parameters would provide the referencing sequences. Now the problem was how to get such kind of sequence on excel, if you type in value & then drag it on other cells it figures a pattern & then somehow pastes that value on the cells. This could work but I wanted it to go up to a number & then resume from the starting number. Again, the internet comes to the rescue & I learned about a few new formulas which were ROWS, COLUMNS, INT & MOD. The combination of these formulas helped me in getting the sequence that too in a dynamic format if suppose the no. of rows or columns in the original main input sheet changed then the sequence also changed accordingly. I also used the COUNTA formula to regulate the sequence.
Here's Video on how you can create sequences in excel
The V-CLOCK Values control the areas that are fixed & already in format while the H-CLOCK values control the items that are to be unpivoted in this case the Date, Week Number & Quantity. The sequences in both columns are different as you can see from the image below the values in the V-CLOCK start from 3 & end at 15 reiterating multiple times until the sequence is completed while the values in H-CLOCK remain constant until the next iteration commences in the V-CLOCK.
Now using the INDEX function, we can get values of the different cells in the spreadsheet. For the column containing the quantities, I had to use the MATCH function in the INDEX function to give the references to the rows & columns in order to get the values.
All this was about the code but there's a small error! When you drag this coded rows to apply formulas to the lower cells it gets the output as #VALUE once the sufficient references aren't available. As a solution, I had to use IF function with a check that if the value is not blank then it'll display blank else it'll execute the remaining command. Another big obstacle was that sometimes the values in the column for Quantity contains text or other characters so it was essential that I get rid of those things & so I used another IF statement with the ISNUMBER function with an argument that IF(ISNUMBER(cell Ref)=FALSE,(keep it bank),(execute remaining command)) so in this way any characters other than numeric values were filtered out.
In this way, I could apply the formulas & stuff I knew about excel & spreadsheets. I'm certain that this is the long way to get the task done & there might be much easier methods to get this task done. I later discovered that PowerBI already has an UNPIVOT option.
Thanks for reading...
Here the capacity Glue gives the main activity, we are keen on : it completes the expansion of two records containing given rooms.ExcelR Data Science Courses
ReplyDelete