The Truth About XML was: openEHR Subversion => Github move progress

Tim Cook tim at mlhim.org
Fri Mar 29 10:15:08 EDT 2013


Hi Tom,

I have amended the Subject Line since the thread has diverged a bit.

[comments inline]

On Thu, Mar 28, 2013 at 9:55 AM, Thomas Beale
<thomas.beale at oceaninformatics.com> wrote:
>
> one of the problems with LinkEHR (which does have many good features) is
> that it is driven off XSD. In principle, XSD is already a deformation of any
> but the most trivial object model, due to its non-OO semantics. As time goes
> on, it is clear that the XSD expression of data models like openEHR, 13606
> etc will be more and more heavily optimised for XML data. This guarantees
> such XSDs will be a further deformation of the original object model - the
> view that programmers use.

I agree with you that you cannot fully represent an object model in
the XML Schema language.
However, you seem to promote the idea that object-oriented modelling
is the only information modelling approach[1].
This is a critical failure. There are many ways to engineer software
using many different modelling approaches.
So abstract information modelling, as you have noted, does not
necessarily fit all possible software modelling approaches, and it is
unrealistic to think that it does. In designing the openEHR model you
chose object-oriented modelling. The openEHR reference
implementation uses a rather obscure, though quite pure,
object-oriented language, Eiffel. I think history has shown that
this has caused some issues for development in other object-oriented
languages.


> So now if you build archetypes based on the XSD,
> you are not defining models of object data that software can use (apart from
> the low layer that deals with XML data conversion). I am unclear how any
> tool based on XSD can be used for modelling object data (and that's nearly
> all domain data in the world today, due to the use of object-oriented
> programming languages).

I think that if you look, you will find that nearly all of the domain
data in the world exists in SQL models, not object-oriented models.
So this is a rather biased statement designed to fit your message,
not a representation of reality.

That said, the abstract concept of multi-level modelling, where a
generic reference model is separated from the domain concept models,
is crucial. Another crucial factor is implementability, as promoted
by the openEHR Foundation mantra: "implementation, implementation,
implementation".

The last, and possibly most important, issue also relates to
implementability: the availability of a talent pool and tooling. To
attract more than a handful of users to a technology, there must be
some level of talent as well as robust and commonly available
tools.

The two previous paragraphs are the reasons that the Multi-Level
Healthcare Information Modelling (MLHIM) project exists.

MLHIM is modelled from the ground up around the W3C XML Schema
language 1.1 [2].
The reason for this is that the family of XML technologies is the
most ubiquitous set of tools throughout the global information
processing domain today. A significant number of open source and
proprietary tools, from parsers/validators to various levels of
editors, are readily available. While serious XML development is not
taught in all university computer science programs, every student is
introduced to XML in some manner.

The relationship of XML with emerging knowledge modelling tools such
as Protégé, with ontologies in OWL[3] and vocabularies expressed in
RDF/XML[4], is an obvious advantage. There is an enormous skills pool
available for using XML data with REST APIs and for translating XML
to JSON for over-the-wire communications, and there are thousands of
websites with information on how to do these things. It is irrelevant
which programming language you choose; whether Java, Eiffel, Ruby,
Lua, Python, etc., there are XML binding tools and XML validators
available. There are tried and true methods of storing XML data in
SQL databases, XML databases and NoSQL databases. XQuery and XPath
are very robust and well known. Another big advantage is the ability
to do data validation, using commonly available tools, along a
complete path: from the instance data to the concept model to the
reference model to the W3C XML Schema specification to the W3C XML
specification.
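As a sketch of the XML-to-JSON translation mentioned above (the
element and attribute names here are hypothetical, not taken from any
MLHIM model), a generic converter needs nothing beyond the Python
standard library:

```python
import json
import xml.etree.ElementTree as ET

def xml_to_dict(element):
    """Recursively convert an Element into a plain dict.

    Attributes map to '@name' keys and text content to '#text', a
    common (though not standardised) XML-to-JSON convention.
    """
    node = {}
    for name, value in element.attrib.items():
        node["@" + name] = value
    text = (element.text or "").strip()
    if text:
        node["#text"] = text
    for child in element:
        child_dict = xml_to_dict(child)
        if child.tag in node:
            # Repeated sibling elements become a JSON array.
            if not isinstance(node[child.tag], list):
                node[child.tag] = [node[child.tag]]
            node[child.tag].append(child_dict)
        else:
            node[child.tag] = child_dict
    return node

doc = ET.fromstring(
    '<observation code="BP">'
    '<systolic units="mmHg">120</systolic>'
    '<diastolic units="mmHg">80</diastolic>'
    '</observation>'
)
print(json.dumps({doc.tag: xml_to_dict(doc)}, indent=2))
```

The '@'/'#text' convention is one of several in use; the point is
only that the translation is mechanical and library support is
everywhere.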

With XML Schema language 1.1, we have the ability to build complex
structures using substitution groups, and to do very intricate data
analysis and validation, across models, using XPath in assert
statements; all without having to resort to RelaxNG or Schematron.
There are also tools, and experience, in using XML Schemas to
automatically generate generic XForms for presentation and data
entry. We may have made concessions in deciding to use XML technology
in MLHIM; however, I cannot think of anything that is missing at this
point.
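For instance, here is a hypothetical XSD 1.1 fragment (the element
names are illustrative, not drawn from the MLHIM reference model)
using xs:assert to enforce a cross-field rule that XSD 1.0 alone
cannot express:

```xml
<!-- Hypothetical example: reject a blood pressure reading whose
     systolic value is not greater than its diastolic value. -->
<xs:element name="BloodPressure">
  <xs:complexType>
    <xs:sequence>
      <xs:element name="systolic" type="xs:positiveInteger"/>
      <xs:element name="diastolic" type="xs:positiveInteger"/>
    </xs:sequence>
    <!-- XSD 1.1 assertion; the XPath 2.0 expression is evaluated
         against the element instance being validated. -->
    <xs:assert test="number(systolic) gt number(diastolic)"/>
  </xs:complexType>
</xs:element>
```

A 1.1-aware validator such as Xerces evaluates the assertion after
the structural checks pass, so no second pass through Schematron is
needed.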

In developing the MLHIM project we took the best ideas from HL7v3 and
openEHR, then sliced away the things that are not required to
maintain a robust healthcare concept modelling infrastructure. You
can get a copy of the XMind-based template model here:
https://github.com/mlhim/tools.git
There are detailed examples of Concept Constraint Definitions (CCDs)
and instance data, as well as documentation for them (and the
reference model), here:
https://github.com/mlhim/tech-docs.git
There are many more places to get information about, and tools used
for, MLHIM and the various sub-projects. A primary starting location
is the MLHIM Community on Google Plus: http://gplus.to/MLHIMComm

The 2.4.2 release of the reference model is scheduled for 30 April
2013. It has been informed by PhD and MSc projects as well as other
implementation projects. We are currently developing a web-based tool
to make CCD development even easier for non-technical domain
modelling experts. Access is currently restricted, but if anyone
would like to participate in evaluating the tool by building CCDs,
please send me an email privately and I can arrange access for you.

The MLHIM eco-system supports development of CCDs without the need
for consensus. The knowledge modeller has full autonomy to build
models appropriate for their application(s), ranging from small,
purpose-specific device apps to large institutional EHRs. There is no
need to be constrained by what has been developed previously. Of
course, a good system supports re-use of high quality components, but
fitness for purpose is left to the model user. More than 1,500
published CCDs are currently available in the Healthcare Knowledge
Component Repository at http://www.hckr.net

I want to close by saying that I am grateful for the work done in the
openEHR community. In my more than ten years of involvement,
including five years on the ARB, I learned a lot! I learned how to do
things right as well as what can go wrong. MLHIM represents those
lessons learned.


Regards,
Tim


[1] - http://en.wikipedia.org/wiki/Software_modelling

[2] - http://www.w3.org/TR/xmlschema11-1/

[3] - http://en.wikipedia.org/wiki/Web_Ontology_Language

[4] - http://www.w3.org/TR/rdf-syntax-grammar/






============================================
Timothy Cook, MSc           +55 21 94711995
MLHIM http://www.mlhim.org
Like Us on FB: https://www.facebook.com/mlhim2
Circle us on G+: http://goo.gl/44EV5
Google Scholar: http://goo.gl/MMZ1o
LinkedIn Profile:http://www.linkedin.com/in/timothywaynecook




More information about the openEHR-clinical mailing list