A Fictionalized Debate Regarding Advanced Data Model Approaches
Chris Allport
Co-Founder Skayl, LLC
June, 2019
Executive Summary
As we further explore the utility of Model-Based Systems Engineering (MBSE), we continue to discover new levels of misunderstanding and confusion around what we, as data modelers and other systems engineers, expect a “model” to look like and how we expect it to perform. At first, it seems we are referring to one and the same concept. Digging a little deeper, however, we begin to realize that there are significant differences in how data models and “conventional” models should be designed, maintained, and used. Though this paper provides an unconventional (and hopefully, entertaining) break from a traditional academic paper, the fictional characters come to the same conclusions many of us have come to in the real world. Data model designs best-fit for one purpose are often not best-fit for another. Technical sidebars throughout the paper are provided to draw parallels between the court case drama and technical application of the FACE™ Technical Standard.
Opening Arguments
“Sorry, your honor, it’s just that salad dressing is delicious, but it’s for salads! We use ketchup and mustard for hot dogs. It doesn’t make salad dressing wrong, it’s just out of context.”
The judge relaxed but stopped short of smiling. “Mr. Damel, need I remind you that you are here to defend against the heresy of your client’s ideas, and not those of the plaintiff? Objection overruled.”
The plaintiff continued. It was painful for Damel to listen to. He spoke so clearly and so simply, and everyone seemed to be following along with his reasoning. They didn’t actually seem to understand what they were being told, but also didn’t seem to mind. Opening arguments seemed to stretch on for an eternity. Damel’s client, Bespoke Data Model Solutions, had really challenged the status quo with their approach to solving the technical challenges. This time, however, they might have gone too far.
It was a large job for an even larger client. Hu Nu Tried and Tru Corporation hired Bespoke to handle the data architecture so their team could focus on the application code. Despite the advanced negotiations, the Bespoke team started falling behind and Hu Nu was looking for someone to blame. Since Bespoke was doing something new that challenged traditional thinking, they were an easy target.
So here we are.
The plaintiff’s attorney took a deep breath. “And so, ladies and gentlemen, we will show you how Bespoke Data Model defrauded the Hu Nu Tried and Tru Corporation, put our entire project at risk, and intentionally delayed making technical decisions so they could charge more at the end of the project for a rush job.” After slowly looking each juror in the eye and flashing a knowing smile, he sat down.
“Mr. Damel, you may address the court.”
Damel leaped up from his chair and emitted a deep baritone note that sounded like a bassline from a classic rock song. Then, without pause, he started to sputter and hiss, which quickly developed into a rhythm. The jurors were confused. Some squirmed in their seats wondering what was happening (clearly this is not what court was supposed to be like) while others smiled and tapped their feet. Damel carefully watched the judge and the opposing counsel. The judge’s complexion changed from red to purple and, microseconds before he rapped his gavel on the bench, Damel stopped. Judge Morose stared him down. “I trust you have an explanation for that…display.”
Now it was time for the other jurors to squirm. They stopped tapping their toes and sat more upright. Oddly, Damel didn’t back down.
“Yes sir, your honor…and ladies and gentlemen of the court. I know it is not traditional – and”, he turned to face the judge, “I certainly meant no disrespect, your honor. If it would not offend the court, I would deliver my entire opening statement in verse. Some of you would be entertained, some annoyed, and some offended. Unfortunately, some of you would be so put-off by my presentation, that you would fail to listen to my message.
“While my methods may be non-traditional, that does not make them entirely wrong. There may be completely valid content, but the delivery is not what you would normally expect and therefore the message is rejected without consideration. Ladies and gentlemen, I assert to you that this is exactly what has happened with my client. They have presented a solution that is so unexpected that it becomes an easy target for criticism. Different doesn’t make it wrong. It simply makes it…different.”
Damel sat down. The judge took off his glasses and rubbed his eyes. “Mr. Cole, you may begin.” And with that, the trial was officially under way.
The Plaintiff’s Case
Witness #1: the MBSE expert
The first witness to the stand was sworn in and introduced as a “subject matter expert in all things related to model-based software engineering.” The first round of questions aimed to establish credibility. With over 150 related publications (including a textbook), he was certainly quite qualified. He also looked very professorial…right down to the almost-but-not-too-aged cardigan sweater. He spoke very clearly and with each response, cited two or three references to back up his statements.
“Dr. Cooley, can you please explain model-based software engineering?”
The witness smiled, pleased to be given such an easy question. “Certainly. Software engineering is the practice of designing and building computer programs to ensure that they meet the requirements set forth by the customer…or user…whichever the case.” He paused to make sure that everyone understood and then continued. “In model-based software engineering, we use ‘models’ to capture the interactions and details of the software. Instead of relying on engineers to read source code – that’s the language we use to write computer programs – we can use models to help with the development process.”
Satisfied with that answer, the attorney continued, “It sounds like you are saying that the model directly relates to the development of the software.”
“That is correct.” Cooley smiled. He was pleased that someone appreciated his explanation.
“In the context of what you’ve already discussed, can you explain why my client’s project is in peril?”
Cooley leaned forward. “Well, it’s pretty simple. Although MBSE is supposed to make software development more efficient, it is still a laborious process. In traditional software engineering, we use a number of different measurements – we call them metrics – to determine the complexity and cost of a project. One such metric is called SLOC, or source lines of code. This is the number of lines of software we expect it will take to implement the functionality. While there are much better metrics, this one is representative and facilitates understanding.”
“What does this have to do with the defendant?”
“MBSE can be compared to traditional software engineering metrics. If we were to estimate this job using SLOC, we can expect that the model for this project would end up being a percentage of that SLOC. In my experience, this should be close to 80%. This means that the model development should reflect a corresponding level of effort. Even if this estimate is incorrect by 20% – in either direction – that does not come close to the model development performed to date.”
Cole weighed the necessity of another question – leaving those numbers floating in the air for just a moment. “No more questions. Thank you, Dr. Cooley.”
Damel seemed concerned. “Let me see if I understand correctly. Normally, you would expect a model to grow proportionally to the size of the software development effort?”
“Yes.” Cooley relaxed in his seat. It was an uncomfortable thing to explain, but at least this attorney seemed to understand.
“And the size of the data model product produced by my client has not increased proportionally with the complexity of this effort?”
“Yes.”
Damel changed his tack. “Do you spend a lot of your time building models?”
“Yes, sir.”
“So…that requires a lot of time gluing together disparate pieces?”
“Yes, it does.”
Damel walked over to his briefcase and pulled out a cardboard box with a picture of a 1965 Ford Mustang on it and proceeded to dump the pieces on the table. “Sir, is this a model?”
“Yes…but it’s different,” Cooley replied.
“Fair, but is this a model?”
“Yes.”
“And does the work related to modeling consist of gluing together disparate pieces?” Damel was picking up steam.
“Yes…but still…”
Damel pressed, “I accept that this is different, however, would you agree that we just used similar words to describe very different processes?”
“Yes.”
“Okay, then is it possible that we use similar words to describe data models that are built through very different processes?”
Cooley had a puzzled look on his face. “Yes…but we are talking about data models…not toys!”
“Correct! But physicists use models of the atom, correct?”
Cole interjected, “Objection! Witness is not called to testify as an expert on physics.”
“Sustained.”
Damel continued without a beat. “Dr. Cooley, in grade school, did you ever use Styrofoam balls to build a model of the solar system?”
“Yes.”
“Can you use the same materials to build a snowman?”
“Yes.”
“In your opinion, then, is it possible to use the same types of material to produce different products?”
“Yes.”
“Do you think that this same thing is possible for data models?”
“I’m not sure what you mean,” Cooley said.
“In your testimony, you indicated that the data models are being built for the purpose of generating source code. I am merely asking if you think it is possible to build a data model for a purpose other than the generation of source code. Do you think it is possible to build a data model for a different purpose?”
Cooley furrowed his brow, “Sure, I guess that’s possible, but what would you ever do that for?”
“Thank you for asking. I don’t know the answer, but I’m sure my client would be happy to explain. Thank you, Dr. Cooley. Your honor, I have no further questions.”
The judge dismissed Dr. Cooley and the confounded professor stepped down. He was lost in thought, trying to understand what that last question was supposed to mean.
Models for Documentation
In traditional MBSE (Model-Based Systems Engineering) processes, the data model primarily exists to be turned into source code. The model itself is another representation of the software in source code format. In some cases, the functional aspects of the software are coded into the models and the models themselves are processed into the application source code.
Data models built upon the FACE™ Technical Specification are an enhanced form of documentation that can be leveraged by computers to facilitate integration. Although these models can be read by humans, the primary goal is to make them unambiguous and consistent for a computer to read and process. While this may ultimately lead to source code generation, it is not intended to generate the application code: it is intended to generate the integration code.
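To make the distinction concrete, the following is a minimal sketch of documentation that a program can read and turn into integration code. It is not the FACE™ data model format; the `Field` record, the concept label, the unit tables, and the interface names are all hypothetical and exist only for illustration. Two interfaces document the same temperature in different units, and the adapter between them is derived from that documentation rather than written by hand.

```python
from dataclasses import dataclass


# Hypothetical, simplified stand-in for one documented interface field.
# This is NOT the FACE data model format; it only illustrates the idea.
@dataclass(frozen=True)
class Field:
    name: str     # field name as it appears on the interface
    concept: str  # what is being described, e.g. "cabin.air_temperature"
    unit: str     # unit of the value carried by the field


# Conversions routed through a common base unit (kelvin); assumed for the sketch.
TO_KELVIN = {
    "kelvin": lambda v: v,
    "celsius": lambda v: v + 273.15,
    "fahrenheit": lambda v: (v - 32.0) * 5.0 / 9.0 + 273.15,
}
FROM_KELVIN = {
    "kelvin": lambda v: v,
    "celsius": lambda v: v - 273.15,
    "fahrenheit": lambda v: (v - 273.15) * 9.0 / 5.0 + 32.0,
}


def make_adapter(source, sink):
    """Derive a message translator from the documentation of two interfaces."""
    by_concept = {field.concept: field for field in source}

    def adapt(message):
        out = {}
        for target in sink:
            # Raises KeyError if the source does not document this concept.
            origin = by_concept[target.concept]
            kelvin = TO_KELVIN[origin.unit](message[origin.name])
            out[target.name] = FROM_KELVIN[target.unit](kelvin)
        return out

    return adapt


# Two interfaces that describe the same concept in different ways.
sensor_interface = [Field("temp_f", "cabin.air_temperature", "fahrenheit")]
display_interface = [Field("cabinTempC", "cabin.air_temperature", "celsius")]

adapter = make_adapter(sensor_interface, display_interface)
result = adapter({"temp_f": 212.0})  # roughly 100 degrees Celsius
```

Nothing in the sketch generates application behavior. The only thing produced is the glue between two documented interfaces, which is the role the sidebar assigns to the data model.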
Witness #2: the data modeling professional
Cole called the second witness to the stand. This witness, Jay “Coop” Cooper, was identified as a data modeling expert.
“Mr. Cooper, have you had the opportunity to review the products the defendant has developed for my client?”
Cooper replied, “Yes, sir.”
“Would you say these products are indicative of a useful data model? Does this data model appear to contain the information needed for model-based software engineering?”
“Well, not really.”
“Objection, ambiguous response,” asserted Damel.
Cole broke in, “Your Honor, this is a complex topic and I’ll ask Mr. Cooper to elaborate on his response.”
“I’ll allow it.”
“Mr. Cooper, can you please explain your answer? I do not see how the product can be both good and bad.”
“There is certainly evidence that a lot of work has been done, but, as Dr. Cooley stated, it does not seem congruous with the amount of work needed to build all of the software. Second, the product was very difficult for me to read and understand – there are much simpler ways to capture the information that is in this model. And finally, some of the documentation is unnecessarily complicated.”
“I see.” There was a smug satisfaction in Cole’s eyes. “Let’s break down each of these concerns for the jury. I believe we can forgo discussing the first point as that was covered by the previous witness. As for the second, can you explain the difficulty of reading the model?”
Cooper sighed. “It’s difficult to explain. The model captures all the information, it is just difficult to read. If you look at a schema – a schema is a simple way of structuring information – you can usually find all of the information in one place. While it isn’t necessarily easy to read, it is not difficult to find all the information.”
Damel seemed agitated but held his objection. He had a counter argument and just needed to be patient.
Cooper continued. “Which sort of leads to my third point.”
“About the model being unnecessarily complicated?” Cole reminded everyone.
“Yes, that. The model is used to document different concepts. So, when the model captures the temperature of a cup of coffee, it does it three or four different ways.”
Cole probed, “And why is that a problem?”
“It becomes ambiguous. There are multiple ways to refer to the same temperature.”
“Which is a problem because?”
“Think of it like this. When we write with words, we use different words that mean the same thing to add flavor to our writing. However, when we do technical writing, we need to be precise in our documentation. If we use different words, readers look for the differences that the word choice is intended to convey. Therefore, we need to keep the way we talk about things consistent.”
Cole felt the need to drive the point home. “And the model the defendant created violates this concept, correct?”
Cooper replied in the affirmative and Cole turned the witness over to Damel.
“Mr. Cooper,” Damel started, “I concur with your testimony. In fact, I daresay that it was spot on.” A quiet murmur spread throughout the courtroom but faded before the judge needed to react. “What you said is absolutely correct and I am not able to dispute any of your facts. Thank you for taking the time to understand my client’s model and providing such a sound explanation.” Cooper was smiling, pleased that he had done a good job. “The only problem is that you have the wrong perspective.” At that, the smile disappeared.
“Mr. Cooper, you asserted that the model was difficult to read, but you did, in fact, read it. Is that true?”
“Yes?”
“Sir, you must have read the model in order to provide the analysis you gave in your testimony. So, did you read the model?”
“Yes, I did. But…it was difficult.”
“Correct, it is difficult, but let me ask you. Was reading the model difficult? Or was it difficult to learn the process to read the model?”
“I’m not sure I understand your question.”
“Okay, I’ll break this down into a few simple questions. Was there a consistent, repeatable process for reading the data model?”
“Yes.”
“And did you learn this process to perform your analysis?”
“Yes, sir.”
“Did this process change after you learned it?”
“Sir?”
“Once you learned the process, did it change?”
“No.”
“And did the process consistently give the same result?”
“Yes.”
“When you first learned to read English, was it difficult to read?”
“Yes.”
“But over time…?”
“It got easier, but,” Cooper seemed to have an epiphany, “this process is repeatable even where English is ambiguous.”
“Thank you, Mr. Cooper. That’s correct.” But Damel still had to finish the point. “So, if the process is repeatable and yields consistent results, and if you can learn how to read it – even if the process is not abundantly simple – do you think it would be easy to write software that could read it?”
Cooper thought for a second.
“Absolutely. So, the model isn’t really intended to be easy for me to read – it’s meant for a computer to read?”
“Yes, sir. Thanks for the help, Mr. Cooper. And as for your final point, about the model ambiguity. Why do you think we would go through so much effort to build ambiguity?”
“Truly, that is one part that makes no sense, but you clearly have the same concept mapped to three different places!”
Damel smiled. “Are you sure? Do you have a thermostat in your home?” Cooper nodded. “Well, how many temperatures are associated with your thermostat?”
“One…no, two.”
“Please explain.”
“Well, the thermostat has to know the current temperature as well as the set point, or, the desired temperature.”
“Correct, go on.”
“That’s it. There are two ways to talk about the thermostat’s temperature.”
“I believe you are mistaken. There are two different temperatures to talk about.”
Cooper acquiesced. “Ah, so you need different model structures to capture these distinct aspects of temperature.”
“Yes, thank you, Mr. Cooper. Would you care to revise your earlier testimony?”
“I suppose. If, as you describe, the data model was not intended for a human to read, it would be easy for a computer to read. Although it seems like the same concept is mapped in three different places, it isn’t. There are actually several, slightly different, concepts mapped to distinct places. Although I hate to admit it, I did see the pattern – the data wasn’t randomly repeated. What’s more, if I need to review the model, it’s possible – even if it’s not easy. It would even be possible to use tools to help me read the model better. I guess that’s why we use diagrams. It seems like your solution is actually less ambiguous than even I thought it could be!”
“No further questions.”
Cole just sat there. He had produced two stellar witnesses and Damel had effortlessly dismantled their arguments. Actually, he didn’t really dismantle anything; he simply explained things from a different perspective. Cole was beginning to wonder how to approach the next witness when the judge cleared his throat. “Mr. Cole, I’ve never known you to be short of words. Do you have another witness, or do you rest?”
“Your honor, we have one more witness to call – Arla Champlain.”
Structurally Unambiguous
Rather than rely on words (and more words) to convey meaning, these data models define the possible relationships between the different elements to which the system interfaces provide access.
Well-formed data models make use of repeatable patterns and structural differences. When coupled with constraints applied by the FACE™ Data Architecture, the documentation captures the meaning of something in one and only one way. No synonyms, no homophones, no amphiboly.
Since semantic meaning is encoded into reusable and extensible structures, the models are not extremely easy for humans to read. However, this format does make it rather easy for computers to process.
When computers can process documentation and when that documentation is unambiguous, the automated generation of the integration layer just becomes another data transform.
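A small sketch can illustrate what “one and only one way” means in practice. It borrows the thermostat from Mr. Cooper’s testimony and is only loosely inspired by the uniqueness idea described above; the `Characteristic` and `ConceptualModel` classes and the rule they enforce are assumptions made for this example, not the constraints defined by the FACE™ Data Architecture. Meaning lives in the structure (entity, role, observable) rather than in the label, so a second label for an existing structure is rejected as a synonym.

```python
from dataclasses import dataclass


# Hypothetical structure: meaning is the combination of these three parts,
# not the human-friendly label attached to it.
@dataclass(frozen=True)
class Characteristic:
    entity: str      # the thing being described, e.g. "Thermostat"
    role: str        # which aspect of the entity, e.g. "measured" or "setpoint"
    observable: str  # the kind of quantity, e.g. "Temperature"


class ConceptualModel:
    """Toy model that stores each meaning exactly once (assumed rule)."""

    def __init__(self):
        self._by_meaning = {}

    def define(self, label, characteristic):
        existing = self._by_meaning.get(characteristic)
        if existing is not None and existing != label:
            # Same structure under a new name would be a synonym.
            raise ValueError(f"'{label}' duplicates the meaning of '{existing}'")
        self._by_meaning[characteristic] = label
        return characteristic

    def labels(self):
        return sorted(self._by_meaning.values())


model = ConceptualModel()

# Two different temperatures, not two ways to say the same temperature.
model.define("current_temperature",
             Characteristic("Thermostat", "measured", "Temperature"))
model.define("desired_temperature",
             Characteristic("Thermostat", "setpoint", "Temperature"))

# A third label for an existing meaning is refused.
try:
    model.define("room_temp",
                 Characteristic("Thermostat", "measured", "Temperature"))
    synonym_rejected = False
except ValueError:
    synonym_rejected = True
```

The check itself is trivial for a program to run, which is the point: the structure that makes the model tedious for a person to read is exactly what lets software detect ambiguity mechanically.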
Witness #3: the data analyst
“Ms. Champlain is a data expert and we are calling upon her expertise in data construction and analysis. She has many years of experience in the database world as well as many more years of experience in the data modeling world.”
Champlain was sworn in, and Cole began with his questions. “Ms. Champlain, can you describe the analysis you performed for my client?”
“Sure, y’all,” Champlain drawled. Her southern accent did not seem to match her west coast style. “I was given two data models and asked to compare ’em. And you told me that you’d be askin’ questions about that analysis.”
“That is correct. My client provided two different models to Ms. Champlain. For the record, we’ll call them Model A and Model B. Ms. Champlain, can you please describe the primary differences you noted in the two models?”
“I’d be happy to. The models were very different. Clearly, they were built by two different folks. There were two major differences that struck me. One of the models was very easy to read. There was a very clear and definitive mapping between two parts of the model.” Champlain paused. “And…well…this mapping wasn’t the least bit clear in the other model…but…but…” And then she just stopped.
“Ms. Champlain? Ms. Champlain!?” Mr. Cole waved his hands in front of Champlain’s face eliciting no reaction whatsoever. She just sat there, staring at the wall. Cole turned and looked at his client, “Has she done this before?” His client just shrugged.
A noisy chatter started building in the courtroom and the judge moved to restore order. The instant his gavel smacked his bench, Arla Champlain jumped and squeaked out a frightened “ooh.”
“Uh, sorry, y’all. I was just thinkin’ about something.”
Cole was relieved that his witness recovered. “Are you okay, do you need a drink or something?”
“No, I was just thinkin’ about that model. Somethin’ doesn’t make sense.”
Cole scratched his head; this was not necessarily going in the direction he expected. “Can you please explain?”
“Well, thinkin’ about the models, there’s something counterintuitive.”
“I’m not sure I follow. Can you please explain?”
“The way I read it, both models capture the way the software communicates about the information.” Champlain started drawing circles and lines in the air that only she could see. “However, the models use them very differently. In the one model, this information is all mixed up. It isn’t clear from a cursory glance at the model that it makes any sense at all! However, the other model, at first glance, looks perfect.”
Sensing a trap, Cole directed his witness to a different point. “Appearances can be misleading, can’t they, Ms. Champlain?” Champlain stopped with a quizzical look. “You said there were two things?”
“Uh, yeah, the other model has some really weird structures. It was almost like it had proxy objects all over the place.”
Cole had stumbled back in line – he had heard this complaint about the defendant’s model. “Can you please explain what you mean by a proxy object?”
“A proxy is something that takes the place of something else. There are a lot of instances where the connections in the model point to a proxy instead of the actual meaning of the data. It’s just, really strange. Why point to something else when you can just point to the actual thing!?”
“How does this affect model quality?”
“Model quality? Once again, this is ambiguous. Now, instead of talking about one specific thing, you might be able to talk about one thing while you mean another! I would say that’s pretty poor model quality.”
“And to which model are you referring?”
“Model B.”
Cole enthusiastically spoke up. “Ladies and gentlemen” – coup… – “please note that Model B was the model” – …de… – “developed by the defendant.” – …grace. “Your witness, counsellor.” And with that, Cole victoriously plopped down in his chair.
Beaming, Damel stood up and approached the witness. “Ms. Champlain, you were about to explain how the appearance of models can be deceiving. Would you please continue that explanation?”
Champlain smiled sheepishly, clearly pleased to be able to finish her earlier line of thought. “Sure, as I was saying, Model A – the ‘control’ model – was really easy to read while your client’s model – Model B – was really confusing. However, the beauty of Model A was only skin deep.”
“How so?”
“It’s simple. Model A, with its clean mappings, is really only good for one thing. And something that it was particularly guilty of was duplicating data. It seemed to repeat the same ideas throughout the model in numerous locations. I suppose this is alright if you have a bunch of straight-through mappings, but then it’s limited. Ya know?”
“An earlier witness testified that data duplication was bad because it led to ambiguity. Would you agree with this?”
“Without a doubt. But in this model, it works. You see, this model is just two copies of the message model. It doesn’t really capture any additional information other than a copy of itself. Therefore, half of it is completely unnecessary.”
Damel pressed, “But that extra content could be used for other models, correct?”
“No, not really. I mean, it’s possible, but you are going to end up with lots of ambiguous connections that don’t make any sense. And then you are going to run into that problem of duplication. If you want to capture the position of the ship, which position – and which ship – do ya point to? But the other model – it doesn’t do this. I mean, don’t get me wrong, that model has some weird structures, proxy and all, but I can see where it would have utility into the future.”
“Thank you, Ms. Champlain. While it might be foolhardy for me to question my own client’s model, you said that it was full of unnecessary stuff. However, you didn’t say that it was duplicated. Does this relate to the …proxy… problem?”
“I didn’t say the content was duplicated because it isn’t exactly duplicated. It’s ambiguous. If the model is documenting the position of a vehicle, why would the model point to the position of something else? How do I know which one?” Champlain’s voice got weaker and trailed off toward the end of her question.
“Ms. Champlain, did you have a question?”
“No, sir, I believe I owe you an apology.”
“For what? You’ve simply made observations about a model and are here to present your results. Did you have some insight you care to share?”
Champlain shifted uncomfortably in her seat. “It’s just that, it’s about observations, isn’t it? The system itself may have a true position, but there may be a sensor that detects a slightly different position. And…there is some sort of relationship between them…but they are not the same thing. So the data model is actually able to capture this nuance.”
Damel gave a kind smile. “That’s correct, Ms. Champlain. But you also mentioned that my client’s model was full of extra stuff. However, you didn’t call it redundant. Can you please explain what you meant?”
“Your client’s model has a lot of content. I really only needed a really small fraction of it to do anything useful. It seems that so much of the model is not necessary…oh…unless you are going to use it again in the future! Then you can have extra data, you just use what you need! And that’s why you have all of those weird connections, so you can express things in terms of the real world, not just one person’s view of the world!”
“Could not have said it better myself. Thank you, Ms. Champlain. No further questions.”
Cole was beside himself. He needed to salvage the case but was out of witnesses. Each of his witnesses brought a sound perspective and repeatedly affirmed his client’s case. There was no point in the process at which he thought he might lose. Yet, by looking at the problem from a slightly different perspective, Damel had captured every fact in his favor. Every point that Cole had prepared for his closing argument had been countered before closing statements even began. Cole wasn’t feeling so well…
Modeling Disciplines & Model Quality
A data model should not be a direct reflection of the interfaces it is intended to document. Models constructed this way often result from an incomplete understanding of the reasons for having a data model. They are often constructed from a blind adherence to a requirement to produce a conformant data model. These data models have very simple documentation and (typically) violate the aforementioned uniqueness constraints.
While it is entirely possible to create a conformant model using this process, the utility of the resulting model will be limited to generating stub code and will not facilitate integration in more sophisticated systems with multiple, disparate system interfaces.
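The difference can be sketched with Ms. Champlain’s ship example. In the hedged illustration below, every interface name, field name, and path is hypothetical, and the tuple notation is an assumption made for this example rather than anything defined by the FACE™ Technical Standard. A model that merely mirrors the interfaces has nothing but field names to match on, while a model that ties each field to a path through shared entities can tell a ship’s own position apart from a radar’s observation of it.

```python
from dataclasses import dataclass


# Hypothetical documentation of one interface field. The `path` walks through
# a shared entity model; the notation is made up for this sketch.
@dataclass(frozen=True)
class MessageField:
    interface: str
    name: str
    path: tuple


fields = [
    MessageField("NavService", "pos", ("Ship", "position")),
    MessageField("RadarService", "pos", ("Radar", "observes", "Ship", "position")),
    MessageField("ChartDisplay", "ownShipLocation", ("Ship", "position")),
]


def links(fields, key):
    """Pair fields from different interfaces that agree on the given key."""
    found = []
    for i, a in enumerate(fields):
        for b in fields[i + 1:]:
            if a.interface != b.interface and key(a) == key(b):
                found.append((f"{a.interface}.{a.name}", f"{b.interface}.{b.name}"))
    return found


# Interface-mirroring model: only names are available, so two fields that
# happen to share a name are wrongly treated as the same thing.
name_based = links(fields, key=lambda f: f.name)

# Entity-based model: fields are linked only when they point at the same
# place in the shared model, regardless of what each interface calls them.
path_based = links(fields, key=lambda f: f.path)
```

Under these assumptions, the name-based match pairs the navigation position with the radar observation, which are related but distinct, and misses the chart display entirely. The path-based match connects the navigation service to the chart display and leaves the radar’s observation as its own concept.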
Conclusion
As we continue to explore the utility of models, we discover more confusion between the fit and purpose of data models versus conventional models. Until all parties fully acknowledge the depth of the differences in purpose and, therefore, the necessity for the application of fundamentally different methods, we will continue to be locked in a nonsensical debate where each party speaks past the other.
As practitioners, we must all be aware of how others are perceiving our work. That is, when others expect that our models are intended to encapsulate application code, they will be grossly disappointed. A significant level of intentionality is needed to discern how others expect the model to be used. As explored above, we use many of the same words to describe different concepts. Although both parties think they are saying the same things, the meaning they communicate does not match.