Proposal for the management of the IFC specification. #10
Comments
Hi Ian, Nice proposal! The benefits of this type of system have been identified internally prior to this. The main blocker is the amount of work that would be required to replace the existing tooling used to generate the IFC schema documentation (the IfcDoc tool) with a system that supports git workflows, such as automating the compilation process using CI/CD pipelines.
The other issue that buildingSMART has is that the IFC standard inherits its processes from ISO. buildingSMART has a committee in place that essentially facilitates a process prior to 'releasing' any new version of IFC. This is for quality checks and testing. I appreciate your suggestion that we could write tests for all of this, but this is a point that would need discussion. A further suggestion I would like to add to this proposal concerns the way IFC handles its version naming: the ecosystem could benefit from a semantic versioning system, for clarity to developers working with tools that are dependent on specific schemas. |
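As a rough illustration of the semantic versioning suggestion above, here is a minimal Python sketch. It assumes schema releases are labelled `MAJOR.MINOR.PATCH`; IFC releases are not named this way today, so the version strings and the names `SchemaVersion`, `parse_version` and `is_compatible` are hypothetical.

```python
from typing import NamedTuple


class SchemaVersion(NamedTuple):
    """A hypothetical MAJOR.MINOR.PATCH label for a schema release."""
    major: int
    minor: int
    patch: int


def parse_version(text: str) -> SchemaVersion:
    """Parse a string such as '4.1.0' into its three numeric parts."""
    major, minor, patch = (int(part) for part in text.split("."))
    return SchemaVersion(major, minor, patch)


def is_compatible(tool_built_for: SchemaVersion, file_schema: SchemaVersion) -> bool:
    """Semantic-versioning rule of thumb: a tool can read a file when the
    major versions match and the file's minor version is not newer than
    the one the tool was built for. Patch releases never break readers."""
    return (file_schema.major == tool_built_for.major
            and file_schema.minor <= tool_built_for.minor)


if __name__ == "__main__":
    tool = parse_version("4.1.0")
    for candidate in ("4.0.2", "4.1.3", "4.2.0", "5.0.0"):
        print(candidate, is_compatible(tool, parse_version(candidate)))
```

Under a scheme like this, a developer could tell from the label alone whether a tool built against one schema release can be expected to read a file written against another.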
Sounds great. I would recommend not asking for top-down permission, but rather innovating at the edges: working with a few third-party developers/users at first and slowly working out from there. I believe discussion is more fruitful when it is grounded in actual code and workflows than when it tries to address and answer all possible contingencies beforehand. |
@bigdoods Thanks! I realize that there are many conversations to which I was not privy in which the above proposal was discussed. I'm playing catch-up a bit. I realize that integrating the documentation generation into the CI/CD pipeline might be a good-sized task. The fact that this concern has been raised by more than one person (in offline conversations), and now by yourself, only makes me more certain that this work should be undertaken. Having a documentation process that is disconnected from the development and build pipeline leads to things getting out of sync, and keeps projects from moving forward rapidly. With regard to semantic versioning, I agree. I did get one bit of feedback that semantic versioning may not work with regard to BSI and ISO. I would need someone with more knowledge about those requirements to comment. I would also agree with the comment by @jmirtsch on Twitter that we should include all past and future versions of the specification in the |
Hi Ian, saw an email on this and had some thoughts... I agree that GitHub is probably the best tool for maintaining source code across distributed teams, and also works well for maintaining standards that are fully self-contained. This got me brainstorming...
IFC in its current form is comprised of a static/early-bound schema (currently represented in EXPRESS and XSD), along with dynamic/late-bound schemas on top in the form of property sets (name/value pairs more or less) and constrained instance graphs (aka "model views" with rules stating things like an IfcBeam should have its shape represented with IfcExtrudedAreaSolid). These dynamic schemas can also be represented in files: ifcXML for property sets, mvdXML for instance graphs. That said, having these files in separate places may make editing them somewhat confusing and disjointed. And because these files are in disparate niche formats, off-the-shelf tools (e.g. Visual Studio) can't keep everything in sync (i.e. automatic refactoring, compiling source code to ensure validity), which drives the need for custom tools to enforce referential integrity. All of this presents additional learning curves for software developers.
Looking at the issue database at www.buildingsmart-tech.org/jira, it seems most issues don't really impact EXPRESS, but rather property sets and usage (instance graphs). Thus, the static schema (EXPRESS) itself only captures a small subset of what IFC defines. So I would contend that to use code as the "master" that drives the IFC specification, such code needs to go much further than what is in EXPRESS. Or else, if EXPRESS is the "master", then it would need to be annotated in some way (e.g. some data structuring convention embedded within comments). So rather than the following that can be derived from EXPRESS today (flattened inheritance for illustration, assume everything public, property getters/setters):
public abstract class IfcElement
such code would also include "dynamic" aspects most relevant in usage (defined in property sets and model views), which probably incur the most evolution, such as:
public class IfcWall : IfcElement
/* relationships */
/* representations -- captured at Representation */
/* properties -- captured at IsDefinedBy */
/* quantities -- captured at IsDefinedBy */
and similar for distribution elements having specific ports (e.g. PowerInput, ChilledWaterOutput), properties for performance, etc. Then custom attributes can be used to annotate conversion/serialization, e.g. WCF DataMemberAttribute to indicate STEP serialization order and XML naming, Description for attribute documentation (at least for English / default authoring language), Category to indicate property set names, and others invented as needed. C# would seem to work well for such a purpose as it supports extensible custom attributes, has APIs for accessing all of this programmatically, is probably the most widely used language among AEC software developers, and can still be leveraged to generate code in other programming languages.
That said, for a schema that describes geometric concepts such as IFC, diagrams and detailed descriptions are often needed, which may be better maintained as separate PNG and HTML files. For localization, .NET relies on resource files, which may be more suitable than in-line attributes in order to take advantage of editor support. For EXPRESS rules, perhaps they could be replaced with C# (maybe automatically, maybe in both directions) and made more powerful by taking advantage of the full C# language and the .NET libraries. With such an approach, IFC major versions could be major branches, and what we call "model view definitions" could essentially be sub-branches that tack on specific usage such as shown above.
I would think we could fully support the exact same semantics we have today in the exact same formats -- just storing these in code in a more concise and organized form on GitHub. Tools such as IfcDoc could still be leveraged as needed (maybe support GitHub directly instead of just local files), though perhaps used to a lesser extent as much could be done by editing and compiling code directly. For compatibility, it would seem we'd need to preserve the current generic structure of IFC for some time, though if the industry shifts to using such code base as input, then perhaps a future version of IFC could define data serialization in XML or JSON in a more readable form with direct attributes such as above. That said, even if we can make code more accessible to a wider user community in this way, that doesn't mean we have to change the process for how major extensions are done (developing use cases, producing real-world models with representative data, comparing alternatives, etc.); there would just be another avenue (and a more direct one than Jira) for incorporating recommendations and changes. |
@timchipman Thanks very much for this "brainstorming." If I am understanding your proposal above, you're suggesting that the base format for the specification is C# as opposed to EXPRESS. If this understanding is correct then I disagree on the basis that a specification like IFC should be defined in an intermediate language. I can imagine a couple of scenarios in which choosing one language to be the standard could cause big problems. First, you might represent something in C# that would be represented quite differently in another language. The example that comes to mind is IFC's concept of a SELECT. In I would turn your proposal around and suggest that we represent everything in EXPRESS. I'm saying this knowing that I don't have a full understanding of why different parts of IFC are represented in EXPRESS, XML, JSON, etc. Someone with more of an understanding of the history could probably tell me why these different formats exist. But it seems logical to me that you could represent property sets, at least, and possibly model view definitions using EXPRESS. Then you'd have one intermediate language which would use the same grammar, parsers, etc. EXPRESS might not be great, but there's functionality in there that IFC has not yet tapped, which could improve current workflows and help in the organization of the spec similar to what you've proposed above. For example, currently there is no distinction in the IFC specification of what "schema" an ENTITY or TYPE belongs to. This makes it hard to correlate with the documentation, which organizes things by schema. We could split the spec into schemas corresponding to what's shown in the documentation, and use EXPRESS's I agree with the premise that having different formats for the specifications leads to a subpar developer experience. However, many developers don't use, and will never use, Visual Studio. 
As an example, I've written the entirety of I think I need a history lesson about the different parts of the specification and why they're written in different forms. And, I need an argument, if one exists, for why EXPRESS should not or can not be the neutral format for describing IFC. |
Hi Ian, |
ISO 10303 consists of several parts. Part 11 is the EXPRESS modeling language, Part 21 is the data format (the IFC SPF format), Part 28 is the XML format (XSD for the model and XML for the data, i.e. the ifcXML format), Part 22 is SDAI (Standard Data Access Interface), which defines how to make programming language bindings, and Part 23 is the C++ language binding. As Ian pointed out, EXPRESS has some unique features that may not be covered in all programming languages. A language-specific binding should use some logic in order to map everything in EXPRESS, such as select types and multiple inheritance (we don't have this in IFC, but EXPRESS does). There are also user-defined entity data types in EXPRESS, which don't belong to a schema. The format is like @EntityName; for example, IFCDOOR is an entity data type belonging to the IFC schema, and @MYIFCDOOR can be a user-defined entity data type. We don't use this in IFC; instead we use property sets for everything that is not covered in the schema. |
@donghoon I had to go back and read your posts a few times to try and absorb all the information. Thanks! I suppose that one reason I implemented IFC-gen using EXPRESS is due to its compact representation and legibility over the XSD version of the spec. I actually began with XSD, using the |
On the history of IFC, I'm a relative newcomer, but from what I can piece together by looking at older versions and talking with others involved earlier on, the evolution of IFC went something like this:
So essentially there's always been a balance with keeping a separation between a stable core schema, and flexible schemas on top that can evolve without breaking the core. With compatibility as the number one feature, the structure of IFC is not as simple as it could or should be. At some point maybe these flexible schemas on top could stabilize such that the "core" could be expanded and frozen. With the STEP format, this hasn't been possible, as compatibility requires attributes to be in a fixed order, which is probably why a lot of the definitions that weren't there originally were defined using objectified relationships (IfcRelDefinesByType, etc.) as a workaround for compatibility instead of using direct attributes. Other formats like XML and JSON using text aren't impacted by that. If a toolkit can emerge that gets used everywhere, and provides automatic upgrade/downgrade between IFC releases (rather than software dealing with STEP files directly), then perhaps compatibility constraints can go away. On the topic of using EXPRESS or C# or some other language... for working with STEP format or generating code for programming languages based on the core schema, EXPRESS certainly makes sense for that. If the end goal is to put the schema on GitHub in an organized directory structure that reflects IFC comprehensively (core schema + property sets + model views + format configurations), and in a form that can draw in the widest audience of software developers, in my view it may be beneficial to consider more mainstream alternatives. An unscientific comparison suggests that there are at least 1000 times as many software developers familiar with C# compared to EXPRESS (another problem is that there's no good search term): Though EXPRESS isn't rocket science, there are many who dismiss IFC because of the unfamiliarity or perception that it is "old", or that there's a lack of tools that can support it. 
Not to suggest that C# is necessarily the best or will be as common 5 years from now, but it is relatively well known, and can be used with free tools on multiple platforms (not necessarily Visual Studio). A challenge with EXPRESS is finding tools to work with it without paying thousands to a few specialized consulting companies. If developers are to make changes to .EXP files on GitHub, then they will need a compiler somewhere to ensure there are no errors. From a technical standpoint, C# custom attributes make it possible to capture just about any additional information for data structures that can be used for other formats or programming languages: inline with the respective data definitions, avoiding the need for separate files scattered about, and leveraging a single tool (the compiler) to ensure validity. As far as EXPRESS mappings go, the EXPRESS SELECT construct maps very nicely to "interface" in C# and Java (just that references are formed in the opposite direction), and EXPRESS defined types map nicely to "structure" in C# (not so well for Java, with heap allocation). In summary, I'm not sure any of the above matters if the scope is generating boilerplate code in programming languages. Though if the endgame is to update/maintain IFC and to leverage the widest participation on GitHub, then there might be other things to consider doing at the same time to make that better. |
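The SELECT-to-interface mapping mentioned above can be sketched as follows. This is a minimal illustration in Python rather than C# or Java, and not how IFC-gen or any existing toolkit does it. `IfcAxis2Placement` is a SELECT in IFC whose members are `IfcAxis2Placement2D` and `IfcAxis2Placement3D`; the attributes are heavily simplified, and `Circle` and `describe` are hypothetical stand-ins.

```python
from abc import ABC
from dataclasses import dataclass
from typing import Tuple


class IfcAxis2Placement(ABC):
    """Marker 'interface' standing in for the EXPRESS SELECT.

    In EXPRESS the SELECT lists its member types; here the direction is
    reversed and each member type declares that it belongs to the SELECT.
    """


@dataclass
class IfcAxis2Placement2D(IfcAxis2Placement):
    location: Tuple[float, float]  # simplified: real IFC uses IfcCartesianPoint


@dataclass
class IfcAxis2Placement3D(IfcAxis2Placement):
    location: Tuple[float, float, float]  # simplified, as above


@dataclass
class Circle:
    """Hypothetical, simplified entity with a SELECT-typed attribute."""
    position: IfcAxis2Placement  # may hold either member of the SELECT
    radius: float


def describe(placement: IfcAxis2Placement) -> str:
    """Consumers branch on the concrete type behind the SELECT."""
    if isinstance(placement, IfcAxis2Placement2D):
        return "2D placement"
    if isinstance(placement, IfcAxis2Placement3D):
        return "3D placement"
    raise TypeError(f"not a member of IfcAxis2Placement: {placement!r}")


if __name__ == "__main__":
    circle = Circle(position=IfcAxis2Placement2D((0.0, 0.0)), radius=1.5)
    print(describe(circle.position))
```

The attribute is typed as the interface, so it accepts any member of the SELECT, while code that needs the specifics checks the concrete type at runtime.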
@timchipman Thanks again for helping to fill in the gaps in my history of IFC. I would be interested to see how a SELECT maps to an interface in C#. My understanding is that a SELECT is used to represent one of several possible types. As such, I'm unsure how a property on a C# class, which is of a SELECT type could be one of several possible interfaces. Certainly one could use On the topic of using C# as the base language, I would love it if it were as easy as picking the most widely used language in AECO and doing everything in that language, especially if that language is C#. But it comes down to separating the representation of the data model from the implementation of the same. The authors of IFC chose, I believe correctly, to represent the data model in a language built for that purpose, and that purpose only. The argument about IFC feeling "old" is also well taken. I have thought that myself in the past. But moving to today's version of C#, let's say it's 6.0, will mean that the implementation looks "old" when we're all using C# 13.0, and the ISO committee has refused to let us move forward for any one of a billion reasons (provided we're allowed to represent this standard in C# in the first place). If we take the fact that IFC's specification looks "old" as a given, then we won't be put in a place to bet on the language that won't look old in several years. If we conduct a similarly non-scientific analysis about the change in rate of adoption of programming languages we'd probably find, to our horror, that we'll all be developing in javascript in a few years. I would argue that IFC looking old has to do with the fact that it is old. It takes a long time to agree on changes to the spec, get those changes approved, and for those changes to appear in client libraries. Years perhaps. 
Improving the first parts of that process is outside the scope of And without belaboring the point about why C# might not be a great choice, I'll say that attributes as a strategy to extend C# to convey the entirety of IFC might end up in a mess. I've been involved in authoring a code base where we used attributes extensively to layer meaning onto classes and properties, and it quickly runs wild. If the primary concern is issues like attribute ordering in other serialized representations (XML and JSON), then we should consider why this is such a big deal. IFC 5 changes the order of attributes. IFC 5 client libraries expect attributes in a different order. If backwards compatibility is a requirement for your software, then ship the libraries for reading IFC 4 and IFC 5 with your software, and provide migration logic to upgrade to the newest version. Or, and this is really the solution that everyone seems to want, stop using a format like STEP which encodes constructors, and start using a format which encodes entities. With regard to engaging developers, I agree that there will be hesitation to jump in at the level of EXPRESS. There should be! Making an IFC implementation in a particular language is obviously important to you. Important enough perhaps to learn enough about its canonical representation and to ponder the many ways in which you could probably implement a SELECT incorrectly (as perhaps maybe I have done already :) ). But I think there's probably a much smaller number of people who will be implementing parser logic than there will be contributing to layers on top of the generated code. As an example, I've suggested to Jon Mirtschin that he implement Geometry Gym's API on top of the As a final remark on the C# class example provided above, I'll give a little peek inside how Autodesk is thinking about defining types for the building industry. 
With Revit we have a very strictly defined set of Element Categories which have a fixed set of properties which can be extended with Shared Parameters. This has never been flexible enough for our users who just want to create a thing that they call a "Foo" and give it whatever properties they want, then be able to set the visualization style of Foos in the same way that they would be able to for an Element of a built in category. Consistent and logical, but inflexible. On the opposite end of the spectrum you have Flux where you can put a bunch of numbers in the cloud and call them "Foos", then explain to all your team members what a Foo means, and every consuming system is made to represent a Foo in its own way. Flexible but not consistent. Baking properties into types in C# (if I understand that as the proposal) is closer to the former. The industry benefits from something right in between these two ends of the spectrum. I've come to learn that IFC is very close to this, allowing property sets to be bound to Building Elements using relationships. If the problem is simply that there is no industry standard property set for "Walls", then I think that's a battle that could be won in a way other than embedding those properties into types themselves. We need to propose and standardize the property sets which could have a representation as a type in C# for use in a relationship binding in |
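The idea of binding property sets to elements through relationships, rather than baking properties into types, can be sketched roughly as below. This is a simplified Python illustration under stated assumptions, not generated IFC code: the classes loosely mirror `IfcPropertySet` and `IfcRelDefinesByProperties`, `Pset_WallCommon` is a standard property set name, and the `Pset_Foo` set, the `Element` class and the `properties_of` helper are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class PropertySet:
    """Simplified stand-in for IfcPropertySet: a named bag of values."""
    name: str
    properties: Dict[str, object] = field(default_factory=dict)


@dataclass
class Element:
    """Simplified stand-in for a building element such as a wall."""
    name: str


@dataclass
class RelDefinesByProperties:
    """Simplified stand-in for IfcRelDefinesByProperties: the relationship
    object, not the element type, carries the link to the property set."""
    related_objects: List[Element]
    relating_property_definition: PropertySet


def properties_of(element: Element,
                  relationships: List[RelDefinesByProperties]) -> Dict[str, Dict[str, object]]:
    """Collect every property set bound to one element, keyed by set name."""
    result: Dict[str, Dict[str, object]] = {}
    for rel in relationships:
        # Identity comparison: two elements with equal names are still distinct.
        if any(obj is element for obj in rel.related_objects):
            pset = rel.relating_property_definition
            result[pset.name] = dict(pset.properties)
    return result


if __name__ == "__main__":
    wall = Element("Wall-01")
    common = PropertySet("Pset_WallCommon", {"IsExternal": True, "FireRating": "2HR"})
    custom = PropertySet("Pset_Foo", {"Colour": "green"})  # hypothetical user-defined set
    rels = [RelDefinesByProperties([wall], common),
            RelDefinesByProperties([wall], custom)]
    print(properties_of(wall, rels))
```

Standardized and user-defined property sets attach the same way here, which is the middle ground between fixed categories and free-form data described above.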
This proposal relates to the ongoing management and development of the IFC specification, and the auto-generation of conforming IFC software libraries and documentation. Because the IFC specification does not currently have a repository where this discussion can take place, I'm putting it here as the IFC-gen project would necessarily need to integrate with such a repo.
The primary assertion of this proposal is that the IFC specification should be managed as code, and that the generation of conforming IFC libraries in multiple programming languages, and IFC documentation, should be automatic and triggered by changes to the specification. The specification should be maintained in EXPRESS, as it is the most compact intermediate format and is the one on which the IFC-gen tool relies. As a straw man, I would propose that the IFC EXPRESS specification be stored in an IFC repository on GitHub.
The IFC Specification
The EXPRESS version of the IFC specification should be committed to a repository. The repository's branches would be labelled according to their version of IFC. For example, the IFC4 branch would contain the most recently released version of IFC.
Groups interested in extending IFC could then fork the IFC repository, amend the IFC specification as required for their project, and use IFC-gen on their version of the specification to generate a library in their language of choice. The resulting library could then be used in their service or application. When they are satisfied with their extensions to IFC, they would issue a pull request against the branch on the IFC repo into which they would like to have their contributions merged. The submission would be reviewed, code review comments would be picked up by the submitter, and the code would be merged. During this review process, the community would be able to see and potentially add to the discussion.
Requests for enhancements to IFC could be handled like issues in code. Any user could submit an issue, relating it to a particular release. When that issue is closed, as the result of a change to IFC, the issue would be closed with a note relating it back to the version where the issue has been addressed.
Continuous Integration
Every commit to a branch in the IFC repository would initiate a Continuous Integration System to generate IFC libraries in supported languages (using parsers generated using IFC-gen), and generate documentation to match those libraries.
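A minimal sketch of what such a pipeline might plan for each commit is shown below. Everything here is hypothetical: the `BuildJob` and `plan_jobs` names, the language identifiers and the `html` documentation target are illustrative assumptions, and the sketch deliberately does not invoke IFC-gen, whose command-line interface is not described in this proposal.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class BuildJob:
    """One unit of work triggered by a commit (hypothetical structure)."""
    branch: str   # e.g. "IFC4", matching the branch-per-version convention
    kind: str     # "library" or "documentation"
    target: str   # a language name, or the documentation output format


def plan_jobs(branch: str, languages: List[str]) -> List[BuildJob]:
    """For a commit on `branch`, plan one library build per supported
    language plus one documentation build to match those libraries."""
    jobs = [BuildJob(branch, "library", language) for language in languages]
    jobs.append(BuildJob(branch, "documentation", "html"))
    return jobs


if __name__ == "__main__":
    for job in plan_jobs("IFC4", ["csharp", "typescript"]):
        print(job)
```

Because libraries and documentation are planned from the same commit, neither can drift out of sync with the specification on that branch.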
High Level API
By itself, IFC is not a user-facing API. Tools like the Geometry Gym IFC tools use a higher-level API that is designed for user-comprehensibility. This would still be supported. In fact, the IFC repository and IFC-gen would provide standard tooling on which these higher-level APIs should be built. Discussion of the standardization of a higher-level API for IFC is outside the scope of this proposal.
Databases
EXPRESS is a data modeling language. As such, it does not, and should not, care about where you will be storing your data. For that reason, a conversation about the storage mechanism for IFC (in memory, databases, etc.), and the effect that those storage requirements might have on the IFC object model, is also outside the scope of this proposal.