
Expanding on IFC Metadata Extraction in PowerBI

Digital, How to guides | May 16, 2024

In a previous post, “Extracting IFC Metadata Directly in Power BI Using Power Query,” we explored how to use Power Query in PowerBI to directly extract metadata from IFC files using a Power Query shared by Shift Construction. If you haven’t read that post, I recommend checking it out first for context.

A few people reached out because they found Shift's instructions for importing the IFC file a little confusing, especially for new PowerBI users. So in this follow-up, we'll look at some improvements and modifications to the original Power Query code to streamline the process even further.

Create Table Directly From IFC File

We can resolve the confusion with importing the IFC with a slight modification to the code. This modification allows you to import your IFC data in one step on a model-by-model basis. The change is made in the first few lines like so:

let
    Source = (filePath as text) => 
    let
        // Initial Import using File.Contents to read file from a path
        BinaryContent = File.Contents(filePath),
        Source = Table.FromColumns({Lines.FromBinary(BinaryContent, null, null, 1252)}),
        // The remaining steps follow on from here...

This small change allows you to generate a table directly from the IFC file. Once this change is made, the steps to import the IFC are:

  • Find your IFC in file explorer
  • Right-click, and select “Copy as path” (Windows 11)
  • Open PowerBI to the Transform Data window
  • Find the IFC4PowerBI function
  • Paste the path, removing the quotes
  • Click Invoke

After following those steps, the function will generate a new table named Invoked Function. You can rename the new table and use it in your data visualisations. Not only that, but you can use the function as many times as you want to import IFC data for as many files as you like, the only limitation being your computer's hardware.
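If you prefer to stay in code, the same one-step import can be written directly in the Advanced Editor. This is a minimal sketch: the function name IFC4PowerBI matches the steps above, and the file path is a hypothetical example you would replace with your own.

```powerquery
let
    // Hypothetical path - replace with the path copied from File Explorer
    ModelTable = IFC4PowerBI("C:\Projects\SampleModel.ifc")
in
    ModelTable
```

Repeating this pattern with different paths gives you one table per model, which you can then rename and relate in your data model.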

Optimise Performance with Buffering

If you have already played around with the original Power Query, you may notice your CPU fan starts spinning as the data is processing and the initial load can take some time. You may have also noticed that the status bar at the bottom right of PowerBI shows the amount of data read is significantly more than the size of your original IFC file.

This is because in the original query, the dataset created by ExtractBeforeCloseParenthesis is used multiple times throughout the query. Each time this data is called upon, it is written to disk as new data, and practically the entire IFC file is contained within this dataset. The only things missing are a few lines that contain null values in the initial extraction.

You can resolve this by modifying the original query to wrap ExtractBeforeCloseParenthesis in Table.Buffer. The modified section of the query should look like this:

// Extract Data Before Close Parenthesis
      ExtractBeforeCloseParenthesis = Table.Buffer(Table.TransformColumns(
        SplitByOpenParenthesis, 
        {{"Values", each Text.BeforeDelimiter(_, ")", {0, RelativePosition.FromEnd}), type text}}
      )),

This minor change means that ExtractBeforeCloseParenthesis will be loaded into memory once; once loaded, it is isolated from external changes during evaluation and is referenced throughout the remainder of the query. Using a 38 MB IFC file as an example, this change reduces the data loaded when processing the query from ~150 MB to 39 MB.

You can implement Table.Buffer elsewhere if you feel it’s warranted, but be mindful about how much data you’re buffering and that any data buffered will not take on changes. Some scenarios where Table.Buffer may be beneficial:

Multiple References

  • When your table or list will be referenced multiple times within your query (e.g., in a List.Generate function), using Table.Buffer can improve performance.
  • By buffering the table, you avoid recalculating it each time it’s referenced, which can be especially helpful for large datasets.

Recursive Functions

  • If you’re working with recursive functions, consider using Table.Buffer.
  • Recursive functions often re-evaluate the same data repeatedly. Buffering the table prevents redundant recalculations and speeds up the process.

Before Merging

  • Before merging tables, apply Table.Buffer to the target table.
  • Merging involves comparing rows between tables. Buffering the target table ensures that it’s loaded into memory, reducing the need for repeated scans.

Passing Tables to Functions

  • When passing tables to custom functions, consider buffering them.
  • If your function processes the same table multiple times, buffering can enhance performance.

Buffering consumes memory, so use it wisely. If you're referencing a table only once, there's no advantage to buffering it. Additionally, the impact of buffering depends on various factors, so always test your queries to find the optimal approach.
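As a sketch of the "before merging" scenario, assuming two hypothetical tables SourceA and SourceB joined on an Element ID column:

```powerquery
let
    // Buffer the table that will be scanned repeatedly during the merge
    BufferedB = Table.Buffer(SourceB),
    // Join against the buffered copy so it is read from memory, not re-evaluated
    Merged = Table.NestedJoin(
        SourceA, {"Element ID"},
        BufferedB, {"Element ID"},
        "Matches", JoinKind.Inner
    )
in
    Merged
```

The same pattern applies to the other scenarios: buffer once, then pass the buffered table to whatever step references it repeatedly.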

Extracting Object Names

When a colleague attempted to use the original query to run metadata checks, they found it had limited usefulness because they had certain metadata values that were tied to specific object name prefixes. The original query only extracts the property sets and none of the “element specific” data.

The way I resolved this was, once the full table was generated, to take the resulting Object ID values, extract the individual line items from the IFC for each object, then finally split out the required object name and type information.

The information that appears as “element specific” in BIM Vision is found in the line related to the object itself, for example:

#8366= IFCCOLUMN('GUID Value',#41,'Object Name Value',$,'Object Type Name value',#8364,#8357,'Object Instance Identifier Value');

My approach to extracting this information was to go back to the SplitByEquals data, extract the matching lines, split the lines by the comma delimiter, then create new columns within the table to reflect the correct data.
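To illustrate why the code below picks out items {2} and {4}, here is a sketch of what splitting the Values portion of the example line by commas yields. The ExampleValues text is hypothetical, mimicking what remains after SplitByEquals; the surrounding quotes are still present at this point and are stripped later.

```powerquery
let
    // What the "Values" column holds after splitting the line on "="
    ExampleValues = " IFCCOLUMN('GUID Value',#41,'Object Name Value',$,'Object Type Name value',#8364,#8357,'Object Instance Identifier Value')",
    Parts = Text.Split(ExampleValues, ","),
    ObjectName = Parts{2},     // 'Object Name Value'
    ObjectTypeName = Parts{4}  // 'Object Type Name value'
in
    [Name = ObjectName, TypeName = ObjectTypeName]
```

Note this relies on the attribute order of the IFC entity; values containing embedded commas would need more careful parsing.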

I’ll be the first to admit I’m no PowerBI whiz, so there may be a more efficient way to approach this, but for now, adding the following code to the query after FinalSplit seems to work well and without too much of a performance hit.

        // Renaming columns to avoid conflicts
        RenamedSplitByEquals = Table.RenameColumns(
            SplitByEquals,
            {
                {"Element ID", "Element ID Renamed"}, 
                {"Values", "Values Renamed"} // Example, rename as needed
            }
        ),
        // Add columns for "Object Name" and "Object Type Name", and remove quotes
        AddedObjectColumns = Table.AddColumn(
            RenamedSplitByEquals,
            "Object Name",
            each let
                originalText = try Text.Split([Values Renamed], ","){2} otherwise null,
                cleanedText = Text.Replace(Text.Replace(originalText, "'", ""), """", "")
            in
                cleanedText,
            type text
        ),
        AddedObjectTypeColumns = Table.AddColumn(
            AddedObjectColumns,
            "Object Type Name",
            each let
                originalText = try Text.Split([Values Renamed], ","){4} otherwise null,
                cleanedText = Text.Replace(Text.Replace(originalText, "'", ""), """", "")
            in
                cleanedText,
            type text
        ),
        // Performing the join
        JoinedData = Table.Join(
            FinalSplit, 
            "Object ID", 
            AddedObjectTypeColumns, 
            "Element ID Renamed", 
            JoinKind.Inner
        ),
        // Remove the now unnecessary columns
        FinalCleanedTable = Table.RemoveColumns(
            JoinedData,
            {"Element ID Renamed", "Values Renamed"}
        )
    in
        FinalCleanedTable

The reason I generated these columns right at the end is that I found it significantly more efficient to have the full set of Object ID values and use a JoinKind.Inner to pull the data together. When I attempted to process the data mid-way through the query with a Table.SelectRows, there was a significant performance hit. Although maybe all we needed was a buffer!

Once this last modification is made to the code, the dataset can be filtered by object name to gather more targeted insights and data validation.
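For example, assuming column-type objects share a hypothetical "COL-" name prefix, a targeted view of the final table can be sketched as:

```powerquery
// Filter the final table to objects whose name starts with a given prefix
ColumnsOnly = Table.SelectRows(
    FinalCleanedTable,
    each [Object Name] <> null and Text.StartsWith([Object Name], "COL-")
)
```

From here, the filtered table can feed a visual or a data validation check against the expected property sets for that prefix.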

With these modifications to the original Power Query code – converting it into a function that builds the complete table directly from a file path, buffering ExtractBeforeCloseParenthesis, and extracting object names and types – extracting and working with IFC metadata in PowerBI is now even more efficient and powerful.

As always, the full updated code is available on GitHub – feel free to use and adapt it for your own projects. If you have any other suggestions for working with IFC in PowerBI, I'd love to hear them.
