|Title:||Notes from the SC34 Meeting, Baltimore, 6-9 December 2002|
|Project:||All SC34/WG3 Projects|
|Status:||Meeting notes (unofficial)|
|Date:||13 January 2003|
|Distribution:||SC34 and Liaisons|
|Reply to:||Dr. James David Mason
(ISO/IEC JTC1/SC34 Chairman)
Y-12 National Security Complex
Information Technology Services
Bldg. 9113 M.S. 8208
Oak Ridge, TN 37831-8208 U.S.A.
Telephone: +1 865 574-6973
Facsimile: +1 865 574-1896
E-mail: mailto:[email protected]
Ms. Sara Hafele Desautels, ISO/IEC JTC 1/SC 34 Secretariat
American National Standards Institute
25 West 43rd Street
New York, NY 10036
Tel: +1 212 642-4937
Fax: +1 212 840-2298
E-mail: [email protected]
Notes from the SC34 Meeting, Baltimore, 6-9 December 2002
Status: The following notes were prepared by Patrick Durusau, [email protected], during the meetings of SC34/WG3. These notes do not have any official status, conferred or implied by their presence in the SC34 archive. They are intended to serve as an aid in recalling issues and discussions from those meetings and are not intended to be a complete record of those meetings. Decisions of SC34/WG3 are made in accordance with ISO policies and procedures and, as such, are recorded solely in the official minutes of that body.
(Lars) One topic B with a subject indicator that points to subject A. Is it a subjectIndicatorRef or a srcloc (after merger)? Three possible resolutions: 1. make it subID, 2. make it srcloc, 3. let it be both. subjectIndicatorRef should be a topicRef. If a topicRef is used, it ends up as srcloc. This means 2 and 3 would be the solutions. Lars: should go with #2. Better to go with topicRef; do it as srcloc.
Given a locator item [notation = "URI", address = "http://www.somewhere.com"], the standard says there is only one defined locator syntax. If you use your own, you must use the x- prefix. Martin Bryan suggested that the IETF will be talking about Internationalized Resource Identifiers (IRI) and that we need to add HyTime syntax. Do we need a mechanism for introducing new notations? Steve: do we actually need this? Could make it a published subject.
Can reify anything except a topic. Not possible now because it implies merging. with subjectIndicatorRef to refer to the topic. Using a resourceRef to address an Association (biggest outstanding issue).
Proposed Resolution Should topic items have a reifier property? The answer is no. (Consistent with reification in general. See term-subject-address-def for further discussion.) Definition of reification needs to be revisited. 3.4.4
Japan's comments, more background needed. Section 2.1, issue related to string types. Matching of strings for merging. Text says a string is a sequence of Unicode characters. OK, but the same string can be written in different ways. XML 1.1 says use Normalization Form C (NFC). Lars: four choices: 1. NFC, 2. NFD, 3. normalize without saying which form, 4. ignore the issue. Current text says 3, without saying which form. Export from C has benefits because it follows XML.
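The string-matching issue can be illustrated with a minimal Python sketch. It assumes one normalization form is applied to both strings before comparison (NFC by default, as in choice 1); the function name `same_name` and the sample strings are hypothetical and are not taken from any SAM draft.

```python
import unicodedata


def same_name(a: str, b: str, form: str = "NFC") -> bool:
    """Compare two strings after applying one Unicode normalization form."""
    return unicodedata.normalize(form, a) == unicodedata.normalize(form, b)


# The same visible text written two ways: a precomposed "e with acute"
# versus a plain "e" followed by a combining acute accent.
precomposed = "caf\u00e9"
decomposed = "cafe\u0301"

raw_equal = precomposed == decomposed                   # code point comparison
nfc_equal = same_name(precomposed, decomposed)          # compare after NFC
nfd_equal = same_name(precomposed, decomposed, "NFD")   # compare after NFD
```

Either form makes the two spellings compare equal; what matters for merging is that both sides of the comparison use the same form, which is why leaving the form unspecified (choice 3) is a risk for interchange.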
Should we distinguish between properties that are containers and those that are references? Or let people work it out. Roles on topics and roles on associations, both containment, but can't implement that way. Have role in two separate places. (Distinguish between properties which have containment semantics, and those which are references?)
Is the thing reified by a topic a characteristic of it? Such as reifying a baseName, does the baseName become a characteristic? Really in the subject world and not in the topic world. Not part of the topic map at all. What side of the fence does reification fall? Lars, say it falls on the machinery side.
Martin Bryan - subject identity can be inferred? ISO 13250 states that subject identity may be "inferred from the topic's characteristics." Does SAM need words to the same effect? Lars, a question of wording. SteveP, would you want to do merging on something other than characteristics?
Is a base locator property on the topic map item needed by other specifications? Was a way to locate a topic map. Kal and Graham wanted it removed. Lars sees the following issue: composed-by($A, puccini) query, topic-ids, need locator for higher level standards? base-locator and source-locator on topic map, no notes on Graham's objection
SteveN, subjectIndicator that points at the topic ID every time it occurs. SteveN -> doesn't think we should worry about serialization, not in our scope, if we say what the syntax means, that is enough, have provided for interchange. only concerned with what topic map means -> why would this standard be concerned with conformance clause -> Holger -> to define conformance -> Lars -> look at XTM syntax -> conformance (section 4 of XTM) -> SteveN -> not proper requirement for conformance -> making assumption that is not correct -> must perform logical equivalence testing to determine conformance -> dispute is about what is the canonical syntax to be used. Lars -> whole point of canonical syntax arising from SAM, know what is logically significant or not.
Lars - <topic id="a"> = both sourceloc and subjectid, whereas subjectIndicator = subjectID, question is do we need source locators. Logical equivalent ignores subject locators. Holger -> if don't do serialization do we give enough guidance, Lars -> wants to have an informative annex on serialization
Proposed Resolution Can be resolved by saying display name is needed in order to have non-textual display for topics. Means we will keep the PSI and it is not simply for backwards compatibility. 5.3 remove last sentence of second paragraph, extend what is now the last sentence: "...in context which are indicated by additional parameters on the variant name."
The presence of a TMCL schema may allow applications to improve the result of merging topics/topic maps by providing enough information to allow implementations to do additional transformations and redundancy removal. How should the SAM specification deal with this possibility? Holger -> more of an issue of duplicate suppression. Example: one topic map says SteveP was born in London and another says SteveP was born in Oslo; if only one place of birth is allowed, this is a conflict that should be noticed.
If a locator syntax allows equivalent locators to be given different syntactical expressions normalization must be applied in order to take this into account. Where should the text that sets out this requirement go? Does it belong in this document or in the syntax specifications?
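A minimal sketch of what such normalization could look like for URI locators, in Python. The rules chosen here (lower-casing scheme and host, dropping a default port, treating an empty path as "/") are assumptions made for illustration, not requirements stated in the SAM or in any syntax specification, and `normalize_locator` is a hypothetical name. User info and IPv6 hosts are deliberately ignored.

```python
from urllib.parse import urlsplit, urlunsplit

# Default ports assumed for this sketch only.
DEFAULT_PORTS = {"http": 80, "https": 443}


def normalize_locator(uri: str) -> str:
    """Return a comparable form of a URI locator (illustrative rules only)."""
    parts = urlsplit(uri)
    scheme = parts.scheme.lower()
    host = (parts.hostname or "").lower()
    port = parts.port
    # Keep the port only when it differs from the scheme's default.
    if port is None or DEFAULT_PORTS.get(scheme) == port:
        netloc = host
    else:
        netloc = f"{host}:{port}"
    # An empty path on a host-based URI is treated as "/".
    path = parts.path or ("/" if host else "")
    return urlunsplit((scheme, netloc, path, parts.query, parts.fragment))


a = normalize_locator("HTTP://WWW.Somewhere.com:80")
b = normalize_locator("http://www.somewhere.com/")
```

With these rules the two spellings of the same address compare equal, so topics whose locators differ only in that way would be candidates for merging. Which rules apply depends on the locator notation, which is one argument for placing the requirement in the syntax specifications.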
Holger Rath: Has been on hold waiting for the SAM. About to revisit the requirements since the SAM should appear next Spring in London (XML Europe 2003). Listed existing TMQL languages. There is a mailing list for TMQL at Yahoo groups.
New capabilities for TMQL (Holger and Lars): Not one big thing, but separate parts. One module the query part, insert and update part, may have something similar to XPath, could have TMPath (don't know if it should be part of TMQL or some other separate part). Lars: lot of use to build topic map driven websites, with TMQL have a standard for queries, but could also have TMSLT to shape the output part for the website. People not satisfied to just display name of the topic, string operations, etc. to shape the results. Ann: Continuing UK concern, original 13250 aim to handle TM information embedded in documents. Lars: still collecting use cases for TMQL. SteveP: should solicit proposals. Lars: should have a workshop, for presentation of proposals at XML Europe 2003.
Instructions to Editors Prepare for workshop XML Europe, produce new draft of the requirements and seek use cases. Proposals and Use Cases by March 17, 2003 submitted to the TMQL discussion group, http://groups.yahoo.com/group/tmql-wg or the editors (Lars Marius Garshol, [email protected]; H. Holger Rath, [email protected])
Steve Pepper: several options for revising ISO 13250. Ann Wrightson, are you saying update the roadmap? Michel, need to find a framework for the roadmap. Pepper, why are we doing this? Newcomb, can't we derive this from requirements for SAM and RM. Lars, should we allow ourselves to add types to names, etc. Newcomb, how much are we going to put on the table? Mason, if NP for new addition, revision, opens up, potentially, the entire standard, however, may be the most likely way to deal with the issues before us, SAM and RM are rather large additions to the standard. Could go to a multi-part international standard. Pepper, then talking about a revision. Michel, concerned to progress by steps to make the standard stronger, do not open everything up for discussion. Michel, consider SAM and RM as additions. Mason, SAM and RM have no official status, spending time on things not in our scope. Mary, need a certain formality to the proceedings. Pepper: don't have to give the impression that everything is up for grabs. Formally speaking it is possible to comment on any part of ISO 13250. Need to make specific what work is being undertaken. Mason: New Work Item proposal, has a list of documents to be considered. Reviews the NP document. Michel: multi-part, will be a revision. Mason: but multi-part directs attention to the new parts. Ann: talking about national body comments and not everyone possible. Michel: objects to use of "revision" while SAM and RM are not stable. Newcomb: Do what is necessary to put everything on the table except the syntax. Pepper: thinks that is a great idea. Would be a sign that it is not changing anything. Ann: put forth a new work item which is designed to contain the new stuff. Get a new number? Pepper: but existing stuff needs to be changed.
Michel: default standard is XTM syntax. concentrate on preserving that syntax. is a revision in a sense. Lars: restatement is the better term, people discover new concerns while using the standard. Mason: as SC34, we can reject a requested change. Michel: need to preserve backwards compatibility but also not add features that present users would not have thought of with the old standard. Pepper: should err on the side of restrictive. Sam, dare to do less. Lars, if going to do changes, need to say so in advance. Now is the time to change and should make all the changes at one time. Michel, might have to find new requirements from new implementations. Ann: only way to get stability for the syntax is to say we are not going to change the XTM syntax and develop a new syntax. Lars: Roadmap is in part to allow different syntaxes. Pepper: more of a restatement than a revision, start with HyTM and XTM. Ann: XTM should be kept but should not be in a position of being guarded from development. Sam, have the restatement task and also have the review process every 5 years. Ann, have things in the standard that were by ISO process, wants a successor to XTM syntax. Mason, Sam brought up the periodic review process, out of our scope of thinking right now, won't be up until 2007. does not initiate the process.
Proposed action This is a restatement of ISO 13250 where we: 1. specify a data model (RM/XTM); 2. explain the relationship between HyTM and XTM; 3. produce a canonicalization syntax to enable conformance testing; 4. fix bugs.
Newcomb: would be a way to kill HyTM if no one wants to work on it. Pepper, would have to re-word the standard to use the SAM. Ann: thinks the HyTM analysis would be very useful. Mary: is there an interest in HyTM, anyone using it. Pepper: encourage people to bring news of use of HyTM.
Pepper: status, approved as new work item 19756, draft requirements from July, 2001, have a couple of submissions, Ontopia is one of them, Pepper is project editor. Put on ice until progress on data model, since end of 2001. Is it time to go forward with TMCL? Lars: don't have a satisfactory requirements document, long list of questions that have not been answered by that document. soliciting use cases and proposals is premature. Newcomb: likes the division of XML documents from their DTDs, need the DTD only if you want to validate, TMCL document should not be required to know what a topic map means. Lars: unavoidable that the schema will do more than just constrain, because presence of constraints provide information. Pepper: Issue of modularization is important. Michel: with interchange, what is the meaning of TMCL, Holger: data typing Lars: examples, sorting by height would be numbers.
Norway: Steve, saw it as meeting in Montreal reached five conclusions. 1. should be optional. 2. topic map authors should be able to indicate when TNC applies. 3. should be done at baseName level 4. should use an attribute of merge on or off 5. in SAM, if label not subject to TNC, if identifier, it is subject to TNC. Reviewed Montreal and decided only the first one was acceptable. allowing authors to specify whether TNC applies, if at individual baseName string, can create confusion sees it as a conflict.
Lars: the TNC in 13250 and XTM qualifies as a bug, machinery to make it optional is bug enhancement, use unambiguous property, TMCL should be able to declare rules for merging on whatever characteristics. mistake to do this with baseNames in scope. Leaves issue with theoretical topic maps that use TNC to establish identity, use TMCL to specify the merging rules.
Pepper: scope interferes with other things, must specify the scope before uttering the name. in practice utter the name and then choose from a short list. namespace aspect of scope creates conflicts. scoped by name of composer since two names are the same.
Lars: scope is overloaded because baseName does not have types, the real problem is the TNC. wants a core set of merging rules, does destroy addressing topics by name, who needs to address topic by their names,
Mary: real control vocabularies that are being developed by different groups in two different companies, Asia is different in different companies. must be one method of assigning identity in the company.
Newcomb: instead of merging, having two identical names in the same scope should be an error. Wants to preserve ability to address topics by name. Lars: difficult to be sure that two topics are in the same scope and want them to remain distinct, why is it important to address topics by name, and why do you want it.
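The naming-constraint debate turns on what to do when two topics carry the same base name in the same scope. A minimal Python sketch of just the detection step follows, assuming a simplified model in which a scope is a set of topic ids; the function name `tnc_collisions` and the sample data are hypothetical. Whether a collision should trigger merging (the TNC as written), be reported as an error (Newcomb), or be left to rules declared in TMCL (Lars) is the open question, so the sketch stops short of choosing.

```python
from collections import defaultdict


def tnc_collisions(topics):
    """Find (base name, scope) pairs that occur on more than one topic.

    `topics` maps a topic id to a list of (name, scope) pairs, where each
    scope is a frozenset of scoping topic ids.
    """
    by_key = defaultdict(set)
    for topic_id, names in topics.items():
        for name, scope in names:
            by_key[(name, scope)].add(topic_id)
    # Only keys shared by two or more topics are collisions.
    return {key: ids for key, ids in by_key.items() if len(ids) > 1}


# Made-up data: the same name string in two different scopes.
topics = {
    "t1": [("Tosca", frozenset({"opera"}))],
    "t2": [("Tosca", frozenset({"opera"}))],
    "t3": [("Tosca", frozenset({"character"}))],
}
collisions = tnc_collisions(topics)
```

Here t1 and t2 collide, while t3 stays distinct because its name is in a different scope. That is the namespace use of scope that Pepper and Lars describe as overloading.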
Michel: can create namespace by using topic types or by roles. address by name, the usual way to address things is by name, used in natural language processing Newcomb: many companies have controlled vocabularies, reason for them is to address by name
Holger: to address a topic, use name in a certain namespace, rather address something by its identity, request something by its name, query to find something, should take TNC away and support for all characteristics.
Pepper: two requirements, author says this is name in a namespace, and to support homonyms. can we tie this into typed-names issue for the SAM, Ann: two related things, points at the illusion of topic maps, a subject that can be identified, by some means other than a name, remember started where knew topics should merge go together.
Newcomb: thinks identifiers are the same thing as names, arguing for unique identifier be preserved, critical feature to have unambiguous addressing. add types to names, would get what he wants. Could do the same thing with scope.
<topic id="EKN-foo">
  <baseName>
    <instanceOf><topicRef xlink:href="#EKN"/></instanceOf>
    <baseNameString>foo</baseNameString>
  </baseName>
</topic>
Should only allow one instanceOf and would allow it to have scope.
<topic id="EKN">
  <instanceOf>
    <subjectIndicatorRef xlink:href="http://[PSI for controlled vocabulary or unique property]"/>
  </instanceOf>
  ...
</topic>
Lars: doesn't this create a uniqueness rule in prose rather than in TMCL. Mary: works for Newcomb and she can just ignore it. Holger: how to define when merging happens. Mary: how to use scope, would not be used to disambiguate names. Holger: what is the meaning if instanceOf is not in the baseName, if not there. Lars: will say what he does not like in writing, can't merge two foos if in same scope. Michel: not sure instanceOf is a good name. Holger: what is the influence of baseName instanceOf on variantName, may be redundant to TMCL, Pepper: need to offer guidance on usage
Should the subject identifiers defined by XTM 1.0 be retained as they are, or should new equivalent ones be defined to replace the originals? Newcomb, not happy with what we have, OASIS should provide a space for this. Standard should support PSI within itself. May be a use case for URN's. Lars: need to have someone go away and find a solution. Pepper: what is status of topicmaps.org. could transfer to ISUG. Mason: public text in SGML, Lars: gives a way to use URN's and something else. IETF has issued specs on resolving URN's. Michel: will need to keep old PSI's along with the new. Pepper: bigger problem with language and country codes, will have to re-write the PSI's. Could have a topic map with 8 topics that merges to support the PSI's
The definition of scope is different from that of XTM 1.0 and ISO 13250:2000, in that it explicitly says topic characteristics assignments are valid for each of the subjects in its scope individually. Is that acceptable?
Newcomb: scope is just one of an unbounded number of assertions about assertions, see Steve's email, Scope, again. annoying to have to define a class instance type to qualify a relationship. scope is an escape valve without defining an assertion type.
<Standard id="RDF">
  <Name>Resource Description Framework</Name>
  <Name lang="fr">....</Name>
  <Used-In project="...">....</Used-In>
Name becomes a role, Used-In would be the scope
<occurs occrl="Used-In"
Lars: 1. Any Subjects, (XTM, 13250) 2. All Subjects (inconsistent with merging rules) , 3. Leave it open #1 is dead, choice affects the interchange of topic maps. current merging rules imply "all subjects"
Holger: three levels, lowest level is scope set itself, identity of scope level in RM (not true for SAM, only property of characteristic), do we care about that in the SAM; next level, merging, scope governed merging, related to general question of how to interpret scope, leave up to application, in context of SAM, this is a certain application of scope for merging; can do ALL and use 3 for interpretation
Proposed Resolution Should not define operations and the specification of merging should be re-written to a more declarative style. If declarative formulation is found to be hard to understand, we may keep algorithm in an explanatory note.
<topic ***> <baseName id="a"> <instanceOf>#b</instanceOf> <topic subject1 (pointing to "a") <instanceOf
Newcomb: subjects are primarily conferred upon nodes, node has a subject by virtue of playing a role in one or more assertions, a node's situation in the graph, is what gives the node its subject. assertions yield subjects to the nodes that are role players. how is the subject represented? nodes have properties, which are values? requires to say a type but does not have a type system. properties get values from their situation, assertion type, roles played and applications' definition of what properties should be assigned to those nodes. Two kinds of properties, SIDP, subject identity discrimination properties. Ann: linguistic structure is primary and formal structure is secondary. What is merging? 1. have a topic map graph, some nodes inside assertions, others are not: merging process, looks at every node to determine its situation in the graph, graph must be well-formed, well-formed may have multiple nodes for the same subject, then calculate all the property values for all the nodes, all of the assertion types have been defined, 2. then look at values, SIDP, when nodes have the same values, if merger has occurred, then have to go back and start over again (since the situations have changed) Logic of the application drives the merging process. Whenever a user defines an assertion type, have defined a subtype of the SAM. Users may want to define merging rules for cases when certain types of assertions are used.
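The "go back and start over" step can be sketched as a fixed-point loop in Python. Everything concrete here is an assumption made for illustration: the node layout (plain dicts with optional "psi", "name" and "in" keys), the two identity rules in `sidp`, and the sample data are invented, and the RM's actual SIDP definitions are not reproduced.

```python
def sidp(node):
    """Identity value of a node under two invented rules (hypothetical)."""
    if node.get("psi"):
        return ("psi", node["psi"])
    if node.get("name") and node.get("in"):
        # Identity depends on the node's situation: which node it is "in".
        return ("named-in", node["name"], node["in"])
    return None


def merge_until_stable(nodes):
    """Merge nodes with equal identity values until no merge is possible."""
    nodes = {node_id: dict(node) for node_id, node in nodes.items()}
    merged_into = {}
    changed = True
    while changed:
        changed = False
        seen = {}
        for node_id in sorted(nodes):
            value = sidp(nodes[node_id])
            if value is None:
                continue
            if value in seen:
                keep = seen[value]
                removed = nodes.pop(node_id)
                nodes[keep] = {**removed, **nodes[keep]}
                merged_into[node_id] = keep
                # Redirect references so other nodes see the new situation.
                for other in nodes.values():
                    if other.get("in") == node_id:
                        other["in"] = keep
                changed = True
                break  # situations have changed, so start over
            seen[value] = node_id
    return nodes, merged_into


graph = {
    "c1": {"psi": "http://example.org/norway"},
    "c2": {"psi": "http://example.org/norway"},
    "o1": {"name": "Oslo", "in": "c1"},
    "o2": {"name": "Oslo", "in": "c2"},
}
result, merged_into = merge_until_stable(graph)
```

In the first pass only the two country nodes match. Once they merge, the two "Oslo" nodes have the same situation and match in the second pass, which is why a single pass over the values is not enough.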
Newcomb: diagram, the node of interest on left, need to know the notation of the subject indicator, Theory of Resource Identity, Theory of Subject Indication, Addressing Scheme. What is the subject of the subject indicator node, gets its subject from its situation, because of its role, address <nmsploc...> Ann: but HyTime would not work with some systems, Eliot: HyTime addresses a unique physical thing in the entire universe, It has an address as well, (URI not sufficient for a binding point). URI's are not binding points unless the strings match. Punch line: SAM can bootstrap URI into stronger addressing such as the moral equivalent of a grove. Without enhancing the web somehow, can't do serious information management. Semantic web is doing work related to subjects.
At what level of interpretation does the topic represent the resource? Does it represent that storage location? The stream of bytes? The stream of bytes interpreted in some particular way? The standard must either leave the details open or clarify this. Note that it may be impossible to clarify when the interpretation is left undefined.
Must locators really refer to information resources? Some URN schemes allow resources that are not information resources to be addressed. This affects the definitions of "information resource", "locator", as well as the [subject identifiers] and [subject address] properties.
Lars: logic usually used with set theory, graph theory builds upon it. set theory, member-of, subset-of, implies that sets are discrete objects, set theory forces you to be discrete, should be continuous subject spaces rather than infinite subject spaces. Mereology.
Newcomb: node demanders, define a syntax for interchange and deserialization procedure, can define things in the syntax that you want to point at in deserialization, can point at the node or information item that results from deserialization. can do the same as the baseNameString.
Architectural forms in XML, amendment and TC in the works, N1985, N1988, which constitute revisions to the standard. are they complete? Kimber: N1957 - Architectural forms by processing instructions only, the TC is completely different, that document is not complete, would require completely executive decisions or gets time from Pete to work on it. Kimber has only real HyTime implementation but no customers using it. amendment is not all that pressing, Kimber: treat HyTime as archival and move on. withdraw TC. Ann: considers TC would be beneficial to the standard. (N1988) Newcomb: thinks HyTime should explode into a family of standards.
Balloted and approved so could just publish it. Does not conform to current HyTime draft. Newcomb: has been delayed for seven years so what is another year? Kimber: SteveP says SMDL was out of variance with HyTime, can't really fix the syntax without a data model, worked with SteveN, would be quite solid, Kimber has a draft ready for SteveN to read. Newcomb: more a chance for mischief if publish it with problems. Ann: should not pursue non-commercially viable work in the ISO process, the priority should be given to topic map standards, defer this work until after the topic maps standards are complete. Kimber: can file an interim draft
Newcomb: RM is an attempt to allow 100 flowers to bloom. the collocation objective (Elaine's book), all information about a subject is known from a particular point. The SAM is one instance of the RM but there could be others. Pepper: what is meant by ontology? Newcomb: can express John Sowa's ontology in the RM by specifying a TM App (the SAM should be one)
Pepper: no doubt that he is impressed by the RM (and PMTM4) leading us forward from ISO 13250, but for that reason, worried that we are in danger of letting topic maps continue to evolve, do have to move onto the next stage. thinks we need to draw the line, but RM is taking us beyond topic maps. had a point when 13250 appeared that was well defined. proposes whether we should make the RM a new work item, with its own number, etc., Steve and Michel resign as 13250 editors and let Lars continue in that role.
Sam: objects to severing the connection. Michel: important to know what we are going to do. Agrees with analysis about what is happening with topic maps, but disagrees with conclusion. wants a stable standard, also thinks that standard cannot be completely fixed. don't need to simply protect vested interests in the current standard. not really interested in the position of editor, topic map standard should be one standard. SAM is a formalization of XTM, and RM is the mechanism that allows enabling something else.
Sam: not sure the SGML/HyTime analogy is the best one, SGML/XML is the better one, ISO lost control of SGML, issue is not one of freezing it, in the form of the SAM, preserving brand under ISO
Newcomb: issue is not what any of us are talking about, feels like the SAM needs the RM, to preserve the SAM. wrong headed from long term perspective, to create SAM with merging rules on assertion types that are built into the SAM, to restrict the SAM to merging rules limits the use of topic maps. SAM is something to inherit, may want to create assertion types
Newcomb: promote the SAM but have the RM for additional power. do we want to limit the brand name topic maps to a particular ontology. Pepper: agrees, do users need to understand the RM to use topic maps, great danger that we will kill the topic map paradigm. Michel: believes that RM does not have anything to do with knowledge representation as we speak of it in topic maps. topic map user, topic map implementer (must use the SAM) Newcomb: must get our story straight
Bernard: RM does not have to be an explicit part of the standard. Ann: if the RM is necessary machinery for the SAM, then place for it is in an annex. Sam: sympathetic to branding, are we going to restrict topic maps to the SAM association types. Pepper: goes beyond topic maps and should be a standard in its own right. could be used with RDF-2, possibly. need more time to develop RM, not in line with schedule for the SAM. Michel: completely disagrees with what Pepper said, pretending RM is above topic maps, standards work is 10% technical and 90% political, RM is a way to show interoperability
Draft reference model says 10 things an Application must do:
Newcomb: can't have multiple role players of the same role, such as two topics with the role type, Bernard, does not get the explanation, gets rid of all the issues of cardinality. if don't have a node that does not reify the set, you don't have a topic that reifies the node. discussion ensues about implicit sets, etc.
subjects and hypergraph is defined as distinct sets, HyperGraphTM, hypergraph is a set, dealing with sets, three distinct sets of hypergraph elements. 1. An element is either vertex, edge, incidence. must choose to be in one and not the others. 2. Every incidence links exactly one vertex and one edge. (this is the unique rule for hypergraphs) edge in a regular graph can only connect two nodes with one edge, hypergraph connects two nodes and an incidence, which are three nodes on one edge.
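The two hypergraph rules stated above can be made concrete with a minimal Python sketch. The class and method names are hypothetical, and the representation (string ids, incidences stored as vertex/edge pairs) is an assumption made for illustration rather than anything defined in the draft reference model.

```python
from dataclasses import dataclass, field


@dataclass
class Hypergraph:
    vertices: set = field(default_factory=set)
    edges: set = field(default_factory=set)
    # incidence id -> (vertex id, edge id); storing a pair means every
    # incidence links exactly one vertex and one edge (rule 2).
    incidences: dict = field(default_factory=dict)

    def check(self):
        """Return a list of rule violations (empty when well-formed)."""
        problems = []
        ids = [*self.vertices, *self.edges, *self.incidences]
        # Rule 1: an element is a vertex, an edge, or an incidence, never two.
        if len(ids) != len(set(ids)):
            problems.append("an element belongs to more than one set")
        for inc, (vertex, edge) in self.incidences.items():
            if vertex not in self.vertices or edge not in self.edges:
                problems.append(f"incidence {inc} has an unknown vertex or edge")
        return problems

    def members(self, edge):
        """Vertices attached to one edge; a hyperedge may have many."""
        return {v for v, e in self.incidences.values() if e == edge}


g = Hypergraph(
    vertices={"a", "b", "c"},
    edges={"friends"},
    incidences={"i1": ("a", "friends"), "i2": ("b", "friends"), "i3": ("c", "friends")},
)
ok = g.check()
group = g.members("friends")
bad = Hypergraph(vertices={"x"}, edges={"x"}).check()
```

Unlike an edge in an ordinary graph, the single edge here carries three vertices, one per incidence.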
Nikita wants to say that a, b, and c are true friends, wants to use one role type that three people are friends. wants to say this with one role type, three different roles of the same type, could add a set assertion type in SAM to confer membership property which is also SIDP.
Example from Martin Bryan on the SC34 list, brother Martin and his brother Ian, two roles and one role type. this is in association land. Let's say brothers are always binary. this association type gets turned into an assertion.
Ian-left-brother-----AssertionType-----right-brother-Martin
Martin-left-brother-----AssertionType-----right-brother-Ian
Impact of existence is exactly the same. Situation of the two nodes is symmetrical.
Newcomb: the RM assumes the applications have direct access to the graph, nodes and properties. SAM as written has different view, has a notion of reification, have to reify something in order for it to be an information item. behave as if they did exist, whether created as objects or just pretend they exist. Makes the SAM quite different from the RM. Should RM have an API as an optional feature, can also define optional API characteristics? Pat: can't SAM do less than the RM? Newcomb: yes. RM should say that the SAM can do it.
Pepper: anything else that affects the SAM? Newcomb: don't think so, #2 in the RM needs to be looked at in the SAM, RM says you must say what are the subjects. Pepper: should map RM to the SAM, agreed in the roadmap, Newcomb: agrees it should be done
Pepper: decisions on RM, Newcomb, probably premature. Newcomb willing to spearhead work on mapping RM to the SAM. Should use the brothers example on roles. r-node as an issue, requirement #10, next meeting 7:30 PM Thursday night
This corrected draft prepared 12 January 2003.