
WO2001093078A2 - Method and system for mapping between a source document and a transformation document - Google Patents

Info

Publication number
WO2001093078A2
Authority
WO
WIPO (PCT)
Prior art keywords
transformation
document
file
data
xml
Prior art date
Application number
PCT/US2001/017544
Other languages
French (fr)
Other versions
WO2001093078A3 (en)
Inventor
Zheng Min
Lily Han
Original Assignee
Beezi, Llc.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beezi, Llc. filed Critical Beezi, Llc.
Priority to AU2001275052A priority Critical patent/AU2001275052A1/en
Publication of WO2001093078A2 publication Critical patent/WO2001093078A2/en
Publication of WO2001093078A3 publication Critical patent/WO2001093078A3/en

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90Details of database functions independent of the retrieved data types
    • G06F16/95Retrieval from the web
    • G06F16/957Browsing optimisation, e.g. caching or content distillation
    • G06F16/9577Optimising the visualization of content, e.g. distillation of HTML documents
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/80Information retrieval; Database structures therefor; File system structures therefor of semi-structured data, e.g. markup language structured data such as SGML, XML or HTML
    • G06F16/84Mapping; Conversion
    • G06F16/88Mark-up to mark-up conversion

Definitions

  • the present invention relates to computer networks and information systems.
  • the present invention pertains to a computer based system and method for mapping a source document to a transformation document.
  • FIG. 1a depicts a representation of a conventional paradigm for document representation and display.
  • Rendering engine 106 processes source code file 105, which includes co-mingled content and formatting instructions to generate rendered text/graphics 107, which may be displayed on a display device (not shown).
  • Rendering engine may be, for example, a browser.
  • HTML Hypertext Markup Language
  • HTML has been the ubiquitous representation language for representing WWW ("World-Wide-Web") documents.
  • WWW World-Wide-Web
  • the commingling of content and formatting in HTML documents is a significant problem because processors such as search engines or databases would preferably only operate on the content of a document, independently of formatting instructions.
  • the commingling of formatting instructions and content in WWW documents significantly hampers the reusability of the documents. For example, it is desirable to allow documents to be displayed on various types of display devices, in various environments and contexts. However, if formatting information is fused with content, re-formatting documents for display in different environments is extremely problematic.
  • CSS cascading style sheets
  • FIG. 1b depicts a paradigm of a transformation process in which a data object is transformed into a presentation code object.
  • a data object or file 110 is processed by a source code object 120 to produce presentation code 130.
  • Data object file 110 may be an XML file ("Extensible Markup Language")
  • source code object 120 is an XSL (“Extensible Stylesheet Language”) file
  • presentation code output file 130 is HTML code.
  • the transformation code object rendering engine 106 processes presentation code object 130 to generate rendered text/graphics 107, for example for display on a display device.
  • XML is a technology designed to overcome the deficiencies of HTML, and in particular, the fusion of content and formatting instructions endemic to HTML.
  • HTML has a fixed tag set, while XML allows the definition of any tag sets.
  • XML allows the structuring of content utilizing custom designed structures created by developers and authors, which is completely separated from formatting instructions.
  • XSL is typically used in conjunction with XML to provide a formatting and style language for expression of content structured in XML.
  • An XML document has both a logical and physical structure. Physically, the document is composed of units called entities. An entity may refer to other entities to cause their inclusion in the document. A document begins in a "root" or document entity. Logically, the document is composed of declarations, elements, comments, character references and processing instructions, all of which are indicated in the document by explicit markup. The logical and physical structures must nest properly.
  • a parsed entity contains text, a sequence of characters, which may represent markup or character data.
  • a character is an atomic unit of text as specified in ISO/IEC 10646.
  • a markup declaration is an element type declaration, an attribute-list declaration, or a notation declaration. Markup takes the form of start-tags, end-tags, empty-element tags, entity references, character references, comments, CDATA section delimiters, document type declarations and processing instructions. All text that is not markup constitutes the character data of the document. Comments may appear anywhere in a document outside other markup. Processing instructions allow documents to contain instructions for applications.
  • the function of markup in an XML document is to describe its storage and logical structure and to associate attribute-value pairs with its logical structure.
  • XML provides a mechanism, the document type declaration ("DTD"), to define constraints on the logical structure and to support the use of predefined storage units.
  • DTD contains or points to markup declarations that provide a grammar for a class of documents.
  • An XML document is valid if it has an associated DTD and if the document complies with the constraints expressed in it.
  • An XML document includes one or more elements, the boundaries of which are either delimited by start-tags and end-tags, or for empty elements by empty-element tags.
  • Each element has a type, identified by name, and may have a set of attribute specifications.
  • Each attribute specification has a name and a value.
  • the text between the start-tag and end-tag is referred to as the element's content.
  • the element structure of an XML document may be constrained using element type and attribute-list declarations.
  • An element type declaration constrains the element's content. For example, element type declarations are often used to constrain which element types can appear as children of an element.
  • In order to be displayed to a user, an XML document must be transformed into a presentation language such as HTML.
  • a transformation expressed in XSL describes rules for transforming a source tree into a result tree. The transformation is achieved by associating patterns with templates. A pattern is matched against elements in a source tree. A template is instantiated to create part of a result tree. The structure of the result tree can be completely different from the structure of the source tree. In constructing the result tree, elements from the source tree can be filtered and reordered, and arbitrary structure can be added. A template is instantiated for a particular source element to create part of the result tree. A template can contain elements that specify literal result element structure.
  • An XSL document includes a set of template rules.
  • a template rule includes two parts: a pattern, which is matched against nodes in the source tree and a template, which can be instantiated to form part of the result tree. This allows a stylesheet to be applicable to a wide class of documents that have similar source tree structures.
  • the template part is also referred to as the output actions part.
  • the pattern matching part specifies which elements in the source XML document should use the template to perform a transformation.
  • the output actions part specifies, for a matching element in the source document, the form into which that element should be transformed in the presentation language file such as HTML (i.e., what to output).
  • a template is instantiated for a particular source element to create part of the result tree.
  • a template can contain elements that specify literal result element structure.
  • each instruction is executed and replaced by the result tree fragment that it creates. Instructions can select and process descendant source elements. Processing a descendant element creates a result tree fragment by finding the applicable template rule and instantiating its template. The result tree is constructed by finding the template rule for the root node and instantiating its template.
  • </DIV> </xsl:template> means that if the current element is "name," output <DIV>(result of transformation of current subtree)</DIV>.
  • It is very desirable to provide a mapping between one document and another or between multiple views of a document.
  • the paradigm depicted in FIG. 1b presents significant challenges for mapping between documents or document views. This problem arises because the data file undergoes a transformation process via the source code object file 120.
  • the transformation document i.e., presentation language file (e.g., HTML)
  • HTML presentation language file
  • mapping between one document view and another is essential in modern information systems. For example, WYSIWYG editors display documents on a display device such as a CRT ("Cathode Ray Tube") as they would appear when rendered in hard-copy format.
  • An objective of the present invention is to provide a method and system for mapping a transformation document to a data file.
  • the present invention provides a method and system for mapping between a data object and one or more transformation objects.
  • the transformation object is generated from the data object as a function of a source code object (e.g., a set of transformation rules).
  • This transformation code object may be, for example, a presentation language object which provides input to a display system such as a WWW browser.
  • each element in a data object is assigned a unique identifier.
  • transformation code relating to a particular data object element is marked with the identifier corresponding to the respective data object element.
  • this is accomplished by modifying the source code object and transformation rules.
  • the transformation rules may be modified to generate an "invisible" mark (i.e., a metatag) in the transformation object data object.
  • the invisible mark does not connote any substantive meaning to any subsequent processors such as rendering engines.
  • the present invention is applied to an XSL transformation process for transforming an XML document to an HTML document.
  • the present invention provides a method and system for linking each of a plurality of XML elements included in an XML document to a respective node in an HTML tree that has been generated using an associated XSL file.
  • the linkage between XML elements and HTML nodes provides a powerful mechanism to allow WYSIWYG editing of the XML document, which is presented as a function of an HTML file generated as a function of the XML document and an associated XSL document.
  • a dynamic WYSIWYG editing system includes a processor, display device, one or more input devices and a storage subsystem for storing files.
  • the WYSIWYG system performs an initialization process to prepare for WYSIWYG editing of an XML file.
  • the processor retrieves a desired XML file for editing and an associated XSL file.
  • the processor then performs a transformation step to produce a second XSL file, referred to herein as XSL'.
  • the transformation step is designed to generate a unique ID for each element in the XML document.
  • the transformation step further is designed to generate a wrapper tag in the rendered HTML file for each XML element such that the wrapper tag includes the unique ID corresponding to the associated XML element.
  • the wrapper tag permits association of a particular HTML node with a corresponding XML element. This association permits dynamic editing of an XML document, which is presented to a user via a presentation language such as HTML.
  • the processor generates the XSL' file by changing the output statements of all templates in the XSL file to add wrapper tags to the transformation output.
  • FIG. 1a depicts a paradigm for conventional data structures for providing a WYSIWYG editing system.
  • FIG. 1b depicts a paradigm of a transformation process according to one embodiment of the present invention.
  • FIG. 2 depicts a system for providing WYSIWYG editing of a data object according to one embodiment of the present invention.
  • FIG. 3a depicts a simplified physical structure of an XML file according to one embodiment of the present invention.
  • FIG. 3b depicts a simplified structure of an XSL file according to one embodiment of the present invention.
  • FIG. 4a is a flowchart depicting a set of steps of an initialization process executed by a dynamic WYSIWYG editing system according to one embodiment of the present invention.
  • FIG. 4b is a flowchart of a process for generating a modified XSL file (XSL') from a source XSL file according to one embodiment of the present invention.
  • XSL' modified XSL file
  • FIG. 5a is a flowchart that depicts a main process for providing WYSIWYG editing of an XML file according to one embodiment of the present invention. The process begins in step 510.
  • FIG. 5b is a flowchart of a set of steps of an input event handler according to one embodiment of the present invention.
  • FIG. 5c is a flowchart of a process to return an encapsulating HTML node according to one embodiment of the present invention.
  • the present invention provides a method and system for mapping between a data object (e.g., XML file) and one or more transformation objects (e.g., HTML file) generated as a function of a set of transformation rules (e.g., XSL).
  • data object e.g., XML file
  • transformation objects e.g., HTML file
  • set of transformation rules e.g., XSL
  • XSL set of transformation rules
  • a mapping structure between a data document and a transformation document may be utilized to provide WYSIWYG editing of a data object (e.g., XML file) transformed to a presentation object (e.g., HTML file) using a set of transformation rules (e.g., XSL).
  • a data object e.g., XML file
  • a presentation object e.g., HTML file
  • transformation rules e.g., XSL
  • the present invention provides a mechanism to provide WYSIWYG editing of a data file that has undergone one or more transformations via a set of transformation rules such as those specified in a source code.
  • the present invention could be applied to provide WYSIWYG editing for word processing, code development, etc.
  • the present invention provides a dynamic
  • each XML element is associated with a unique ID.
  • a modified XSL file XSL' is generated.
  • the output portion of each template in the XSL file is modified to include a wrapper.
  • the wrapper includes a function executed at run-time that generates a unique ID for each XML element processed.
  • the transformed result of any XML element is linked to a particular source element in the XML document.
  • FIG. 2 depicts a system for providing WYSIWYG editing of a data object according to one embodiment of the present invention.
  • the system includes processor 230, display device 210, input devices 240a and 240b.
  • input devices may be a keyboard and/or mouse.
  • Display device 210 and input devices 240a-b provide a GUI, which allows editing of files, which are presented as WYSIWYG document 220 on display device 210.
  • Processor 230 is also coupled to storage device 235, which may include a hard disk storage unit or a volatile memory such as a RAM ("Random Access Memory").
  • the WYSIWYG system depicted in provides for editing of an XML document.
  • storage device 235 includes XSL file 230a, XML file 230c.
  • WYSIWYG system performs transformation of XSL file 230 into XSL' file 230b (described in detail below), which is also stored on storage device 235.
  • WYSIWYG editing system also performs transformation of XML file 240c into HTML file 240d, which is utilized as a presentation language, which is then rendered on display device 210 by processor 230.
  • FIG. 3a depicts a simplified physical structure of an XML file according to one embodiment of the present invention.
  • XML file 360 includes a plurality of elements 365(1)-365(N). Each XML element may include a plurality of nested elements. Although not depicted in FIG. 3a, it is assumed that XML file includes an associated element type declaration for each element.
  • FIG. 3b depicts a simplified structure of an XSL file according to one embodiment of the present invention.
  • XSL document 320 includes a plurality of templates 310(1)-310(N).
  • Each XSL template 310 includes pattern matching part 301 and output actions part 303.
  • FIG. 3b shows pattern matching parts 301(1)-301(N) and output actions parts 303(1)-303(N) for each respective template 310(1)-310(N).
  • FIG. 4a is a flowchart depicting a set of steps of an initialization process executed by a dynamic WYSIWYG editing system according to one embodiment of the present invention. The process is initiated in step 410.
  • step 420 an XML data file to be edited is retrieved.
  • XML data file may be stored on storage subsystem 235.
  • step 430 it is determined whether an associated XSL file exists. If so ('yes' branch of step 430), in step 432, the associated XSL file is retrieved. If not ('no' branch of step 430), in step 435 a default XSL file is retrieved.
  • processor 230 processes the XSL file (default or associated) to generate a modified XSL file, referred to herein as XSL'.
  • the XSL' file is generated as a function of the XML data file being edited and the associated XSL file such that each of the transformation rules in each template of the XSL file is modified to generate a transformed result that associates a unique identifier of an element in the XML file with the transformed result.
  • the association between a unique identifier for each XML element and a transformed result is effected by modifying the XSL file to generate a wrapper utilizing a tag structure that encapsulates the transformed result of each XML element and includes the unique XML element ID.
  • the steps for production of the XSL' file are described in detail below with respect to FIG. 4b.
  • a tree data structure is generated for the original XML file, which represents the parent/child relationship between all elements in the XML file.
  • a GUI event loop is initiated to begin editing of the XML file.
  • FIG. 4b is a flowchart of a process for generating a modified XSL file (XSL') from a source XSL file according to one embodiment of the present invention.
  • This process is executed by processor 230 as part of an initialization process (i.e., step 440 in FIG. 4a).
  • the process is initiated in step 451.
  • step 455 it is determined whether all templates in the original XSL file have been analyzed. If so ('yes' branch of step 455), the process ends in step 467. Otherwise ('no' branch of step 455), in step 457 an ID wrapper is generated.
  • the ID wrapper is a <SPAN> tag.
  • the output action portion of the template is wrapped with the invisible element and ID information.
  • the output action portion of the template is modified to include a function called at runtime that generates a unique ID for each element that is matched to the template. For example, consider the following XML element
  • the following modified template would be generated in the XSL' file: <xsl:template match="name"> <DIV>
  • step 463 it is determined whether default templates were overridden in the original XSL file. If so ('yes' branch of step 463), the process ends in step 467. Otherwise ('no' branch of step 463), in step 465, default templates are overridden. The process ends in step 467.
  • FIG. 5a is a flowchart that depicts a main process for providing WYSIWYG editing of an XML file according to one embodiment of the present invention.
  • the process begins in step 510.
  • step 520 it is determined whether the user has selected to display the XML source using the original XSL file. If so ('yes' branch of step 520), in step 530, HTML code is generated as a function of the XML file and a file XSL_SOURCEVIEW, which includes formatting instructions for displaying a source view of the XML code.
  • step 535 it is determined whether the user has selected a WYSIWYG view of the file for editing. If so ('yes' branch of step 535), in step 537, HTML code is generated as a function of the XML file and the XSL' file. This HTML code is then provided to a rendering engine such as a browser for display to the user.
  • the user may perform dynamic editing of the XML code.
  • HTML code is generated as a function of the XML file (which may be modified at this point) and the XSL file.
  • an edit event i.e., an event to edit the XML file.
  • an edit event handler is called in step 549. If not ('no' branch of step 547), flow continues with step 520.
  • FIG. 5b is a flowchart of a set of steps of an input event handler according to one embodiment of the present invention.
  • the event handler is called upon receipt of an input event such as a mouse click or keyboard input event (i.e., step 547 of FIG. 5a).
  • the process is initiated in step 551.
  • a handle or identifier of the HTML element selected by the user is returned using known methods.
  • the DOM is employed, which provides an API call to return an identifier of an object such as an HTML node when clicked by the user.
  • a handle or unique identifier of an HTML encapsulating node is retrieved.
  • the HTML encapsulating node is an HTML node that encapsulates the current node and is associated with an element ID in the XML document.
  • a process for returning the encapsulating node is described below with reference to FIG. 5c.
  • the corresponding XML element ID is retrieved.
  • the XML element ID is established by setting an attribute of the encapsulating element to reference the XML node.
  • the encapsulating element may be a SPAN element in the HTML code or an element that encapsulates other nodes.
  • the XML node is returned, preferably as a pointer to the WYSIWYG editing system, which may be used to map edits made back to the original XML document. The process ends in step 567.
  • FIG. 5c is a flowchart of a process to return an encapsulating HTML node according to one embodiment of the present invention.
  • the process operates by traversing the HTML tree beginning with the current HTML node upwards (i.e., ascending the tree parent by parent) until an encapsulating HTML node is found.
  • the process is initiated in step 571.
  • step 575 it is determined whether the current HTML node is an encapsulating node.
  • An HTML node is encapsulating if it has an attribute that references an XML node. If the HTML node is not encapsulating ('no' branch of step 575), in step 583, the current node is set to its parent node. Flow then continues with step 575 until the traversal of the HTML tree locates an encapsulating node ('yes' branch of 575). The process ends in step 581, with return of the HTML encapsulating node. According to one embodiment, in step 559, the XML node corresponding to the clicked HTML node is retrieved.
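
The wrapper and lookup steps described above can be sketched in Python. This is only an illustration under stated assumptions, not the patent's implementation: it builds the wrapped HTML tree directly instead of rewriting an XSL file into XSL', the attribute name `data-xml-id` and the `x1`, `x2` ID scheme are hypothetical, and the rule table maps an XML element name to a single HTML tag.

```python
import itertools
import xml.etree.ElementTree as ET

ID_ATTR = "data-xml-id"  # hypothetical attribute name carrying the XML element ID

def assign_ids(xml_root):
    """Give every XML element a unique ID; return {id: element}."""
    counter = itertools.count(1)
    return {f"x{next(counter)}": el for el in xml_root.iter()}

def render(xml_el, element_ids, rules):
    """Render one XML element inside a SPAN wrapper that carries its ID."""
    wrapper = ET.Element("SPAN", {ID_ATTR: element_ids[xml_el]})
    html_tag = rules.get(xml_el.tag)
    # Unmatched elements produce no tag of their own; children go in the wrapper.
    target = ET.SubElement(wrapper, html_tag) if html_tag else wrapper
    target.text = xml_el.text
    for child in xml_el:
        rendered = render(child, element_ids, rules)
        rendered.tail = child.tail
        target.append(rendered)
    return wrapper

def encapsulating_node(node, parent_of):
    """Ascend parent by parent until a node carrying an XML element ID is found."""
    while node is not None and ID_ATTR not in node.attrib:
        node = parent_of.get(node)
    return node

xml_root = ET.fromstring("<XML><name>abcdefg</name></XML>")
id_to_element = assign_ids(xml_root)
element_ids = {el: key for key, el in id_to_element.items()}
html_root = render(xml_root, element_ids, {"name": "DIV"})

# ElementTree keeps no parent pointers, so build a child -> parent map.
parent_of = {child: parent for parent in html_root.iter() for child in parent}

clicked = html_root.find(".//DIV")  # stand-in for the node a user clicked
wrapper = encapsulating_node(clicked, parent_of)
source_element = id_to_element[wrapper.get(ID_ATTR)]
print(ET.tostring(html_root, encoding="unicode"))
print(source_element.tag)
```

Because the wrapper carries only an ID, the original XML tree stays untouched, and an edit made on the rendered view can be routed back to `source_element`.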

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computer And Data Communications (AREA)
  • Document Processing Apparatus (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The present invention provides a method and system for mapping between a data object and at least one transformation object. According to one embodiment, a transformation object is generated from the data object as a function of a source code object (e.g., a set of transformation rules). A transformation object may be, for example, a presentation language object which provides input to a display system such as a WWW browser.

Description

METHOD AND SYSTEM
FOR MAPPING BETWEEN A SOURCE DOCUMENT
AND A TRANSFORMATION DOCUMENT
FIELD OF THE INVENTION
The present invention relates to computer networks and information systems. In particular, the present invention pertains to a computer based system and method for mapping a source document to a transformation document.
BACKGROUND INFORMATION
Computer data is often organized into entities called documents, files produced by an application such as a word processor. Modern information systems and workstations provide a GUI ("Graphical User Interface") for the manipulation and viewing of documents. In conventional paradigms for the representation of documents, content and formatting instructions are included in a single document. FIG. 1a depicts a representation of a conventional paradigm for document representation and display. Rendering engine 106 processes source code file 105, which includes co-mingled content and formatting instructions, to generate rendered text/graphics 107, which may be displayed on a display device (not shown). The rendering engine may be, for example, a browser.
HTML ("Hypertext Markup Language") is an example of a tagging language in which content and formatting are largely commingled. HTML has been the ubiquitous representation language for representing WWW ("World-Wide-Web") documents. Known technologies exist for WYSIWYG editing of HTML documents so that changes made via a GUI automatically result in appropriate changes in the underlying HTML code. The commingling of content and formatting in HTML documents is a significant problem because processors such as search engines or databases would preferably only operate on the content of a document, independently of formatting instructions. Moreover, the commingling of formatting instructions and content in WWW documents significantly hampers the reusability of the documents. For example, it is desirable to allow documents to be displayed on various types of display devices, in various environments and contexts. However, if formatting information is fused with content, re-formatting documents for display in different environments is extremely problematic.
There have been various initiatives to separate format from content using HTML. For example, cascading style sheets ("CSS") provide a simple mechanism for adding style (e.g., fonts, colors, spacing) to WWW documents. However, HTML and CSS present inherent limitations for the separation of content and formatting instructions.
Recent developments in document paradigms have emphasized separation of content from style. Using this paradigm, a source document includes only content, which may be presented in any number of formats depending upon an intended audience. Typically, a set of transformation rules is applied to the source document to generate a transformation document for presentation to a user. For example, FIG. 1b depicts a paradigm of a transformation process in which a data object is transformed into a presentation code object. As shown in FIG. 1b, a data object or file 110 is processed by a source code object 120 to produce presentation code 130. Data object file 110 may be an XML file ("Extensible Markup Language"), source code object 120 is an XSL ("Extensible Stylesheet Language") file and presentation code output file 130 is HTML code.
Thus, the transformation code object rendering engine 106 processes presentation code object 130 to generate rendered text/graphics 107, for example for display on a display device.
In the context of editing/presentation, XML is a technology designed to overcome the deficiencies of HTML, and in particular, the fusion of content and formatting instructions endemic to HTML. A major difference between XML and HTML is that HTML has a fixed tag set, while XML allows the definition of any tag sets. XML allows the structuring of content utilizing custom designed structures created by developers and authors, which is completely separated from formatting instructions. XSL is typically used in conjunction with XML to provide a formatting and style language for expression of content structured in XML. An XML document has both a logical and physical structure. Physically, the document is composed of units called entities. An entity may refer to other entities to cause their inclusion in the document. A document begins in a "root" or document entity. Logically, the document is composed of declarations, elements, comments, character references and processing instructions, all of which are indicated in the document by explicit markup. The logical and physical structures must nest properly.
A parsed entity contains text, a sequence of characters, which may represent markup or character data. A character is an atomic unit of text as specified in ISO/IEC 10646. A markup declaration is an element type declaration, an attribute-list declaration, or a notation declaration. Markup takes the form of start-tags, end-tags, empty-element tags, entity references, character references, comments, CDATA section delimiters, document type declarations and processing instructions. All text that is not markup constitutes the character data of the document. Comments may appear anywhere in a document outside other markup. Processing instructions allow documents to contain instructions for applications.
The function of markup in an XML document is to describe its storage and logical structure and to associate attribute-value pairs with its logical structure. XML provides a mechanism, the document type declaration ("DTD"), to define constraints on the logical structure and to support the use of predefined storage units. A DTD contains or points to markup declarations that provide a grammar for a class of documents. An XML document is valid if it has an associated DTD and if the document complies with the constraints expressed in it.
An XML document includes one or more elements, the boundaries of which are either delimited by start-tags and end-tags, or for empty elements by empty-element tags. Each element has a type, identified by name, and may have a set of attribute specifications. Each attribute specification has a name and a value. The text between the start-tag and end-tag is referred to as the element's content.
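
As a small illustration of the element anatomy just described, the following Python sketch parses a sample document and reports each element's type name, attribute specifications, and leading text content. The sample document and the `describe` helper are hypothetical and appear nowhere in the patent; only the standard-library `xml.etree.ElementTree` module is used.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample; element and attribute names are illustrative only.
SAMPLE = '<person id="p1"><name lang="en">abcdefg</name><note/></person>'

def describe(element):
    """Return (type name, attribute dict, leading text content) for one element."""
    return element.tag, dict(element.attrib), element.text

root = ET.fromstring(SAMPLE)
for element in root.iter():
    # <note/> is an empty-element tag, so it has no content at all.
    print(describe(element))
```

Note that `element.text` holds only the character data before the first child element, so a parent such as `person` reports `None` here even though it has element content.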
The element structure of an XML document may be constrained using element type and attribute-list declarations. An element type declaration constrains the element's content. For example, element type declarations are often used to constrain which element types can appear as children of an element.
In order to be displayed to a user, an XML document must be transformed into a presentation language such as HTML. A transformation expressed in XSL describes rules for transforming a source tree into a result tree. The transformation is achieved by associating patterns with templates. A pattern is matched against elements in a source tree. A template is instantiated to create part of a result tree. The structure of the result tree can be completely different from the structure of the source tree. In constructing the result tree, elements from the source tree can be filtered and reordered, and arbitrary structure can be added. An XSL document includes a set of template rules. A template rule includes two parts: a pattern, which is matched against nodes in the source tree, and a template, which can be instantiated to form part of the result tree. This allows a stylesheet to be applicable to a wide class of documents that have similar source tree structures. The template part is also referred to as the output actions part. The pattern matching part specifies which elements in the source XML document should use the template to perform a transformation. The output actions part specifies, for a matching element in the source document, the form into which that element should be transformed in the presentation language file such as HTML (i.e., what to output). A template is instantiated for a particular source element to create part of the result tree. A template can contain elements that specify literal result element structure. When a template is instantiated, each instruction is executed and replaced by the result tree fragment that it creates.
Instructions can select and process descendant source elements. Processing a descendant element creates a result tree fragment by finding the applicable template rule and instantiating its template. The result tree is constructed by finding the template rule for the root node and instantiating its template.
For example, an XSL template may be specified:
<xsl:template match="name">
  <DIV>
    <xsl:apply-templates/> <!-- Recursively transform the subtree with the current node as root -->
  </DIV>
</xsl:template>
This template means that if the current element is "name," the output is <DIV>(result of transformation of current subtree)</DIV>.
Thus, the XML fragment:
<XML>
  <name>abcdefg</name>
</XML>
is transformed, using the above XSL template, to
<DIV>abcdefg</DIV>
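The pattern-and-template mechanism described above can be imitated in a few lines of Python. This is only a sketch of the idea, not an XSL processor: the `RULES` table, `transform`, and `apply_templates` are hypothetical names introduced here, patterns are reduced to plain element names, and unmatched elements simply have their children processed.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# Hypothetical rule table: pattern (an element name) -> literal result element.
RULES = {"name": "DIV"}


def apply_templates(element):
    """Transform the text and child elements of `element` in document order."""
    parts = [escape(element.text or "")]
    for child in element:
        parts.append(transform(child))
        parts.append(escape(child.tail or ""))
    return "".join(parts)


def transform(element):
    """Find the rule whose pattern matches `element` and instantiate it."""
    result_tag = RULES.get(element.tag)
    inner = apply_templates(element)
    if result_tag is None:
        # No matching rule: assumed fallback is to process the children only.
        return inner
    return f"<{result_tag}>{inner}</{result_tag}>"


source = ET.fromstring("<XML><name>abcdefg</name></XML>")
result = transform(source)
print(result)  # <DIV>abcdefg</DIV>
```

The recursion in `apply_templates` plays the role of `<xsl:apply-templates/>`: the result for a node is built from the results for its subtree.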
It is very desirable to provide a mapping between one document and another or between multiple views of a document. However, the paradigm depicted in FIG. 1b presents significant challenges for mapping between documents or document views. This problem arises because the data file undergoes a transformation process via the source code object file 120. Thus, the transformation document (i.e., a presentation language file such as HTML) is unlinked from the original data file (e.g., an XML file). However, mapping between one document view and another is essential in modern information systems. For example, WYSIWYG editors display documents on a display device such as a CRT ("Cathode Ray Tube") as they would appear when rendered in hard-copy format. Known methods exist for WYSIWYG editing of a document where content and formatting instructions are included in one file (FIG. 1a). In this case, changes entered in a WYSIWYG view may be mapped to an underlying source document. For example, modern GUI paradigms such as DOM ("Document Object Model") maintain unique identifiers for each object in a document in a tree structure. An API ("Application Program Interface") provides function calls to return a particular identifier when a user interacts with a rendered object, for example by clicking on a rendered object displayed on a display device. The DOM defines what attributes are associated with each object and how the objects and attributes may be manipulated.
An objective of the present invention is to provide a method and system for mapping a transformation document to a data file.
SUMMARY OF THE INVENTION
The present invention provides a method and system for mapping between a data object and one or more transformation objects. According to one embodiment, the transformation object is generated from the data object as a function of a source code object (e.g., a set of transformation rules). This transformation object may be, for example, a presentation language object which provides input to a display system such as a WWW browser. According to one embodiment, each element in a data object is assigned a unique identifier. During the transformation process, transformation code relating to a particular data object element is marked with the identifier corresponding to the respective data object element. According to one embodiment, this is accomplished by modifying the source code object and transformation rules. For example, the transformation rules may be modified to generate an "invisible" mark (i.e., a metatag) in the transformation object. The invisible mark does not convey any substantive meaning to any subsequent processors such as rendering engines.
According to one embodiment, the present invention is applied to an XSL transformation process for transforming an XML document to an HTML document. In particular, the present invention provides a method and system for linking each of a plurality of XML elements included in an XML document to a respective node in an HTML tree that has been generated using an associated XSL file. The linkage between XML elements and HTML nodes provides a powerful mechanism to allow WYSIWYG editing of the XML document, which is presented as a function of an HTML file generated as a function of the XML document and an associated XSL document.
According to one embodiment, a dynamic WYSIWYG editing system includes a processor, a display device, one or more input devices and a storage subsystem for storing files. The WYSIWYG system performs an initialization process to prepare for WYSIWYG editing of an XML file. During the initialization process, the processor retrieves a desired XML file for editing and an associated XSL file. The processor then performs a transformation step to produce a second XSL file, referred to herein as XSL'. The transformation step is designed to generate a unique ID for each element in the XML document. The transformation step is further designed to generate a wrapper tag in the rendered HTML file for each XML element such that the wrapper tag includes the unique ID corresponding to the associated XML element. The wrapper tag permits association of a particular HTML node with a corresponding XML element. This association permits dynamic editing of an XML document, which is presented to a user via a presentation language such as HTML. According to one embodiment, the processor generates the XSL' file by changing the output statements of all templates in the XSL file to add wrapper tags to the transformation output.
BRIEF DESCRIPTION OF THE DRAWINGS
FIG. 1a depicts a paradigm for conventional data structures for providing a WYSIWYG editing system.
FIG. 1b depicts a paradigm of a transformation process according to one embodiment of the present invention.
FIG. 2 depicts a system for providing WYSIWYG editing of a data object according to one embodiment of the present invention.
FIG. 3a depicts a simplified physical structure of an XML file according to one embodiment of the present invention.
FIG. 3b depicts a simplified structure of an XSL file according to one embodiment of the present invention.
FIG. 4a is a flowchart depicting a set of steps of an initialization process executed by a dynamic WYSIWYG editing system according to one embodiment of the present invention.
FIG. 4b is a flowchart of a process for generating a modified XSL file (XSL') from a source XSL file according to one embodiment of the present invention.
FIG. 5a is a flowchart that depicts a main process for providing WYSIWYG editing of an XML file according to one embodiment of the present invention.
FIG. 5b is a flowchart of a set of steps of an input event handler according to one embodiment of the present invention.
FIG. 5c is a flowchart of a process to return an encapsulating HTML node according to one embodiment of the present invention.
DETAILED DESCRIPTION
The present invention provides a method and system for mapping between a data object (e.g., XML file) and one or more transformation objects (e.g., HTML file) generated as a function of a set of transformation rules (e.g., XSL). Although the embodiment described herein relates to a WYSIWYG editing system, the mapping structure may be utilized more generally to provide a user with a powerful capacity for manipulation and editing of a data object. For example, the present invention may be applied in an environment where it is desired to provide different users different views of a document.
According to one embodiment, a mapping structure between a data document and a transformation document provided by the present invention may be utilized to provide WYSIWYG editing of a data object (e.g., XML file) transformed to a presentation object (e.g., HTML file) using a set of transformation rules (e.g., XSL). Thus, according to one embodiment, the present invention provides a mechanism to provide WYSIWYG editing of a data file that has undergone one or more transformations via a set of transformation rules such as those specified in a source code. Thus, for example, the present invention could be applied to provide WYSIWYG editing for word processing, code development, etc.
According to one embodiment, the present invention provides a dynamic WYSIWYG editing system for XML documents or other documents that undergo a transformation process as a function of source code such as XSL in order to render a presentation language file such as HTML. According to one embodiment, each XML element is associated with a unique ID. A modified XSL file XSL' is generated. In particular, the output portion of each template in the XSL file is modified to include a wrapper. The wrapper includes a function executed at run-time that generates a unique ID for each XML element processed. Thus, the transformed result of any XML element is linked to a particular source element in the XML document.
FIG. 2 depicts a system for providing WYSIWYG editing of a data object according to one embodiment of the present invention. As shown in FIG. 2, the system includes processor 230, display device 210, and input devices 240a and 240b. As depicted in FIG. 2, input devices may be a keyboard and/or mouse. Display device 210 and input devices 240a-b provide a GUI that allows editing of files, which are presented as WYSIWYG document 220 on display device 210. Processor 230 is also coupled to storage device 235, which may include a hard disk storage unit or a volatile memory such as a RAM ("Random Access Memory"). According to one embodiment, the WYSIWYG system depicted in FIG. 2 provides for editing of an XML document. As shown in FIG. 2, storage device 235 includes XSL file 230a and XML file 230c.
In order to provide WYSIWYG editing, the WYSIWYG system performs transformation of XSL file 230a into XSL' file 230b (described in detail below), which is also stored on storage device 235. The WYSIWYG editing system also performs transformation of XML file 240c into HTML file 240d, which serves as the presentation language file and is then rendered on display device 210 by processor 230.
FIG. 3a depicts a simplified physical structure of an XML file according to one embodiment of the present invention. XML file 360 includes a plurality of elements 365(1)-365(N). Each XML element may include a plurality of nested elements. Although not depicted in FIG. 3a, it is assumed that the XML file includes an associated element type declaration for each element.
FIG. 3b depicts a simplified structure of an XSL file according to one embodiment of the present invention. As shown in FIG. 3b, XSL document 320 includes a plurality of templates 310(1)-310(N). Each XSL template 310 includes pattern matching part 301 and output actions part 303. Thus, FIG. 3b shows pattern matching parts 301(1)-301(N) and output actions parts 303(1)-303(N) for each respective template 310(1)-310(N).
FIG. 4a is a flowchart depicting a set of steps of an initialization process executed by a dynamic WYSIWYG editing system according to one embodiment of the present invention. The process is initiated in step 410. In step 420, an XML data file to be edited is retrieved. For example, referring to FIG. 2, the XML data file may be stored on storage subsystem 235. In step 430, it is determined whether an associated XSL file exists. If so ('yes' branch of step 430), in step 432, the associated XSL file is retrieved. If not ('no' branch of step 430), in step 435 a default XSL file is retrieved. In step 440, processor 230 processes the XSL file (default or associated) to generate a modified XSL file, referred to herein as XSL'. The XSL' file is generated as a function of the XML data file to be edited and the associated XSL file such that each of the transformation rules in each template of the XSL file is modified to generate a transformed result that associates a unique identifier of an element in the XML file with the transformed result. According to one embodiment, the association between a unique identifier for each XML element and a transformed result is effected by modifying the XSL file to generate a wrapper utilizing a tag structure that encapsulates the transformed result of each XML element and includes the unique XML element ID. The steps for production of the XSL' file are described in detail below with respect to FIG. 4b.
In step 445, a tree data structure is generated for the original XML file, which represents the parent/child relationship between all elements in the XML file. In step 447, a GUI event loop is initiated to begin editing of the XML file.
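The bookkeeping implied by the initialization process (a unique identifier per XML element, plus a record of parent/child relationships as in step 445) might look like the following minimal sketch. The ID scheme (`e0`, `e1`, ...) and the function name `index_elements` are assumptions made for illustration; the text does not specify how identifiers are formed.

```python
import xml.etree.ElementTree as ET


def index_elements(root):
    """Assign an ID to every element and record each element's parent.

    Returns (ids, parents):
      ids     maps a hypothetical ID string ("e0", "e1", ...) to an element
      parents maps each child element to its parent element
    """
    ids = {}
    parents = {}
    # Element.iter() walks the tree in document order, starting at the root.
    for number, element in enumerate(root.iter()):
        ids[f"e{number}"] = element
        for child in element:
            parents[child] = element
    return ids, parents


root = ET.fromstring("<doc><name>abcdefg</name><name>hijklmn</name></doc>")
ids, parents = index_elements(root)
print(sorted(ids))  # ['e0', 'e1', 'e2']
```

With such an index, an identifier recovered from the rendered HTML can be resolved to the XML element it came from, and the parent map allows edits that need to move up the source tree.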
FIG. 4b is a flowchart of a process for generating a modified XSL file (XSL') from a source XSL file according to one embodiment of the present invention. This process is executed by processor 230 as part of an initialization process (i.e., step 440 in FIG. 4a). The process is initiated in step 451. In step 455, it is determined whether all templates in the original XSL file have been analyzed. If so ('yes' branch of step 455), the process ends in step 467. Otherwise ('no' branch of step 455), in step 457 an ID wrapper is generated. According to one embodiment of the present invention, the ID wrapper is a <SPAN> tag. In step 459, the output action portion of the template is wrapped with the invisible element and ID information. According to one embodiment, the output action portion of the template is modified to include a function called at runtime that generates a unique ID for each element that is matched to the template. For example, consider the following XML element:
<XML>
  . . .
  <name>abcdefg</name>
with an associated XSL template:
<xsl:template match="name">
  <DIV>
    <xsl:apply-templates/>
  </DIV>
</xsl:template>
which would be transformed to the following result using the original XSL file:
<DIV>abcdefg</DIV>
According to one embodiment of the present invention, the following modified template would be generated in the XSL' file:
<xsl:template match="name">
  <SPAN attribute=(ID info of "name" element)>
    <DIV>
      <xsl:apply-templates/>
    </DIV>
  </SPAN>
</xsl:template>
The modified XSL' template would then transform the XML element as follows:
<SPAN attribute=(ID info of "name" element)>
  <DIV>abcdefg</DIV>
</SPAN>
In step 463, it is determined whether default templates were overridden in the original XSL file. If so ('yes' branch of step 463), the process ends in step 467. Otherwise ('no' branch of step 463), in step 465, default templates are overridden. The process ends in step 467.
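One way the template rewriting of FIG. 4b could be prototyped is shown below, using Python's standard `xml.etree.ElementTree` to edit the stylesheet as a tree. Several things here are assumptions rather than part of the described method: the attribute name `xmlid` is hypothetical, the run-time ID function is taken to be the standard XSLT `generate-id()` function, and the override of default templates (step 465) is not attempted. The sketch also ignores templates that begin with `xsl:param`, which would need to stay outside the wrapper.

```python
import xml.etree.ElementTree as ET

XSL_NS = "http://www.w3.org/1999/XSL/Transform"
ET.register_namespace("xsl", XSL_NS)  # keep the familiar "xsl:" prefix on output


def make_xsl_prime(xsl_text, id_attr="xmlid"):
    """Wrap the output actions of every template in an ID-carrying SPAN."""
    root = ET.fromstring(xsl_text)
    # findall() returns a list, so the tree can be edited safely in the loop.
    for template in root.findall(f"{{{XSL_NS}}}template"):
        span = ET.Element("SPAN")
        # <xsl:attribute name="xmlid"><xsl:value-of select="generate-id(.)"/>
        attr = ET.SubElement(span, f"{{{XSL_NS}}}attribute", {"name": id_attr})
        ET.SubElement(attr, f"{{{XSL_NS}}}value-of", {"select": "generate-id(.)"})
        # Move the original output actions (leading text and children) inside.
        attr.tail = template.text
        template.text = None
        for child in list(template):
            template.remove(child)
            span.append(child)
        template.append(span)
    return ET.tostring(root, encoding="unicode")


XSL = (
    '<xsl:stylesheet version="1.0" '
    'xmlns:xsl="http://www.w3.org/1999/XSL/Transform">'
    '<xsl:template match="name"><DIV><xsl:apply-templates/></DIV></xsl:template>'
    "</xsl:stylesheet>"
)
xsl_prime = make_xsl_prime(XSL)
print(xsl_prime)
```

Because `generate-id()` values are chosen by the XSLT processor, an editor built this way would have to obtain the same identifiers for its XML tree from that processor; an explicit ID attribute written into the XML beforehand is an alternative design.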
FIG. 5a is a flowchart that depicts a main process for providing WYSIWYG editing of an XML file according to one embodiment of the present invention. The process begins in step 510. In step 520, it is determined whether the user has selected to display the XML source using the original XSL file. If so ('yes' branch of step 520), in step 530, HTML code is generated as a function of the XML file and a file XSL_SOURCEVIEW, which includes formatting instructions for displaying a source view of the XML code. In step 535, it is determined whether the user has selected a WYSIWYG view of the file for editing. If so ('yes' branch of step 535), in step 537, HTML code is generated as a function of the XML file and the XSL' file. This HTML code is then provided to a rendering engine such as a browser for display to the user. In this WYSIWYG view, the user may perform dynamic editing of the XML code. In step 540, it is determined whether the user has selected a preview view. If so, HTML code is generated as a function of the XML file (which may be modified at this point) and the XSL file. In step 547, it is determined whether an edit event has occurred (i.e., an event to edit the XML file). For example, an edit event may correspond to the user clicking on a particular portion of the screen with the mouse, or providing another selection for editing using a keyboard. If an edit event has occurred ('yes' branch of step 547), an edit event handler is called in step 549. If not ('no' branch of step 547), flow continues with step 520.
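The view selection in FIG. 5a amounts to choosing which stylesheet is paired with the XML file. The sketch below illustrates that choice only; the `View` enum, the `render` function, and the caller-supplied `transform` callable are hypothetical, and a stand-in transform is used because no particular XSLT processor is assumed.

```python
from enum import Enum


class View(Enum):
    SOURCE = "source"    # source view of the XML code (step 530)
    WYSIWYG = "wysiwyg"  # editable view generated with XSL' (step 537)
    PREVIEW = "preview"  # preview generated with the original XSL


# Which stylesheet each view is generated from, per the description of FIG. 5a.
STYLESHEET_FOR_VIEW = {
    View.SOURCE: "XSL_SOURCEVIEW",
    View.WYSIWYG: "XSL'",
    View.PREVIEW: "XSL",
}


def render(view, xml_text, stylesheets, transform):
    """Generate HTML for `view` by applying the matching stylesheet.

    `stylesheets` maps the names above to stylesheet text; `transform` is any
    function (xml_text, xsl_text) -> html_text supplied by the caller.
    """
    return transform(xml_text, stylesheets[STYLESHEET_FOR_VIEW[view]])


def fake_transform(xml_text, xsl_text):
    # Stand-in for a real XSLT processor, used only to show the dispatch.
    return f"[{xsl_text}] applied to {xml_text}"


sheets = {
    "XSL_SOURCEVIEW": "source-view rules",
    "XSL'": "wrapped rules",
    "XSL": "original rules",
}
html = render(View.WYSIWYG, "<name>abcdefg</name>", sheets, fake_transform)
print(html)  # [wrapped rules] applied to <name>abcdefg</name>
```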
FIG. 5b is a flowchart of a set of steps of an input event handler according to one embodiment of the present invention. The event handler is called upon receipt of an input event such as a mouse click or keyboard input event (i.e., step 547 of FIG. 5a). The process is initiated in step 551. In step 553, a handle or identifier of the HTML element selected by the user is returned using known methods. For example, according to one embodiment the DOM is employed, which provides an API call to return an identifier of an object such as an HTML node when clicked by the user. In step 555, a handle or unique identifier of an HTML encapsulating node is retrieved. The HTML encapsulating node is an HTML node that encapsulates the current node and is associated with an element ID in the XML document. A process for returning the encapsulating node is described below with reference to FIG. 5c. In step 559, the corresponding XML element ID is retrieved. According to one embodiment, the XML element ID is established by setting an attribute of the encapsulating element to reference the XML node. The encapsulating element may be a SPAN element in the HTML code or an element that encapsulates other nodes. In step 563, the XML node is returned, preferably as a pointer to the WYSIWYG editing system, which may be used to map edits made back to the original XML document. The process ends in step 567.
FIG. 5c is a flowchart of a process to return an encapsulating HTML node according to one embodiment of the present invention. The process operates by traversing the HTML tree beginning with the current HTML node upwards (i.e., ascending the tree parent by parent) until an encapsulating HTML node is found. The process is initiated in step 571. In step 575, it is determined whether the current HTML node is an encapsulating node. An HTML node is encapsulating if it has an attribute that references an XML node. If the HTML node is not encapsulating ('no' branch of step 575), in step 583, the current node is set to the parent node of the current node. Flow then continues with step 575 until the traversal of the HTML tree locates an encapsulating node ('yes' branch of step 575). The process ends in step 581, with return of the HTML encapsulating node. According to one embodiment, in step 559 of FIG. 5b, the XML element corresponding to the clicked HTML node is then retrieved.
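The upward traversal of FIG. 5c can be sketched with Python's standard `xml.dom.minidom`, whose nodes expose the DOM `parentNode` property. The attribute name `xmlid` and the sample markup are assumptions made for illustration; a real editor would receive the clicked node from the browser's DOM rather than from a parsed string.

```python
from xml.dom import minidom

ID_ATTR = "xmlid"  # hypothetical name of the attribute that references an XML node


def find_encapsulating(node):
    """Ascend parent by parent until a node carrying ID_ATTR is found.

    Returns the encapsulating element, or None if the root is passed
    without finding one.
    """
    while node is not None:
        if node.nodeType == node.ELEMENT_NODE and node.hasAttribute(ID_ATTR):
            return node
        node = node.parentNode  # the current node becomes its parent
    return None


html = minidom.parseString(
    '<BODY><SPAN xmlid="e1"><DIV><B>abcdefg</B></DIV></SPAN></BODY>'
)
clicked = html.getElementsByTagName("B")[0]  # stands in for the clicked node
wrapper = find_encapsulating(clicked)
print(wrapper.tagName, wrapper.getAttribute(ID_ATTR))  # SPAN e1
```

The identifier read from the wrapper can then be looked up in whatever ID-to-element table the editor keeps for the XML document, which corresponds to retrieving the XML element in step 559.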

Claims

What is Claimed Is:
1. A method for mapping between a data document and at least one transformation document comprising the steps of:
(a) assigning an element identifier to each of a plurality of elements in the data document; and,
(b) generating a transformation document from the data document, the transformation document including a plurality of transformation elements, wherein the transformation document associates each transformation element with an element identifier in the data document.
2. The method according to claim 1, wherein step (b) further includes the step of: (i) modifying at least one transformation rule to associate an element identifier with a transformation object, wherein the transformation object is an output of a transformation rule.
3. The method according to claim 1, wherein the data document is an XML document.
4. The method according to claim 1, wherein the transformation document is an HTML document.
5. The method according to claim 2, wherein the transformation rule is included in an XSL document.
6. A system for mapping between a data document and at least one transformation document comprising:
(a) a processor, wherein the processor is adapted to:
(c) assign an element identifier to each of a plurality of elements in the data document; and,
(d) generate a transformation document from the data document, the transformation document including a plurality of transformation elements, wherein the transformation document associates each transformation element with an element identifier in the data document.
7. A method for WYSIWYG editing of a data file, wherein the data file is associated with a source code file for transformation of the data file into a presentation language file, comprising the steps of:
(a) assigning unique identification information to each of a plurality of elements included in the data file;
(b) generating a second source code file, as a function of the first source code file and the data file, wherein the second source code file includes code to associate the identifier that uniquely identifies each element with the transformation output for that element in the data file.
8. A method for WYSIWYG editing of a data file, wherein the data file includes a plurality of elements, the data file being associated with a source code file for transformation of the data file into a presentation language file, wherein the source code file includes a plurality of transformation rules, each of the transformation rules including a pattern matching part for matching an element in the data file and an output actions part for generating a transformed result for an element in the data file, comprising the steps of:
(a) assigning a unique identifier to each of the elements in the data file;
(b) modifying each of the transformation rules, such that each of the transformation rules generates a transformed result that associates a unique identifier of an element with a transformed result.
Priority Applications (1)

Application Number Priority Date Filing Date Title
AU2001275052A AU2001275052A1 (en) 2000-05-31 2001-05-31 Method and system for mapping between a source document and a transformation document

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US58369200A 2000-05-31 2000-05-31
US09/583,692 2000-05-31

Publications (2)

Publication Number Publication Date
WO2001093078A2 true WO2001093078A2 (en) 2001-12-06
WO2001093078A3 WO2001093078A3 (en) 2003-12-24

Family

ID=24334176

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2001/017544 WO2001093078A2 (en) 2000-05-31 2001-05-31 Method and system for mapping between a source document and a transformation document

Country Status (2)

Country Link
AU (1) AU2001275052A1 (en)
WO (1) WO2001093078A2 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2006040237A1 (en) * 2004-10-13 2006-04-20 Siemens Aktiengesellschaft Method for converting data

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7685252B1 (en) * 1999-10-12 2010-03-23 International Business Machines Corporation Methods and systems for multi-modal browsing and implementation of a conversational markup language

Also Published As

Publication number Publication date
WO2001093078A3 (en) 2003-12-24
AU2001275052A1 (en) 2001-12-11

Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A2

Designated state(s): AE AG AL AM AT AU AZ BA BB BG BR BY BZ CA CH CN CO CR CU CZ DE DK DM DZ EC EE ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX MZ NO NZ PL PT RO RU SD SE SG SI SK SL TJ TM TR TT TZ UA UG UZ VN YU ZA ZW

AL Designated countries for regional patents

Kind code of ref document: A2

Designated state(s): GH GM KE LS MW MZ SD SL SZ TZ UG ZW AM AZ BY KG KZ MD RU TJ TM AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE TR BF BJ CF CG CI CM GA GN GW ML MR NE SN TD TG

121 Ep: the epo has been informed by wipo that ep was designated in this application
REG Reference to national code

Ref country code: DE

Ref legal event code: 8642

122 Ep: pct application non-entry in european phase
NENP Non-entry into the national phase

Ref country code: JP
