s.im.pl meta-metadata:
tutorial [data definition]
For the tutorials we will create a meta_metadata object for the site UrbanSpoon.

There are many different types of information pages on UrbanSpoon. When writing meta-metadata for a site, it is best practice to start with the lowest-level information page and then work up. For that reason, this object will be specifically for an UrbanSpoon restaurant information page.

The first step is to create a new XML file for your meta-metadata definitions and place it in the mmdrepository/repositorySources package in the MetaMetadataRepository project. Alternatively, you can add your definitions to an existing file in that folder if one already exists for your information source. You will need to define a meta_metadata_repository tag as the root element of the new XML file.

NOTE: You may need to delete mmdrepository/powerUser/urbanSpoon.xml and mmdrepository/repositorySources/restaurant.xml to prevent some conflicts. After getting through the tutorials, you can use SVN to revert local changes to get them back.

The tag must have the name and package attributes, as in this example:
<meta_metadata_repository name="urban_spoon" package="ecologylab.semantics.generated.library.tutorial.urbanspoon" >
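
As a minimal sketch of the new file at this point, the root element might be opened and closed as follows. The attribute values are the ones used above; the XML declaration line and the placeholder comment are assumptions added for illustration, not something this tutorial requires.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<meta_metadata_repository name="urban_spoon" package="ecologylab.semantics.generated.library.tutorial.urbanspoon">

  <!-- meta_metadata definitions for this information source go here -->

</meta_metadata_repository>
```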

Next, you will need to create a meta_metadata object. All other tags for this object will be nested inside it. The tag has three required attributes, and two further attributes will also be used. The meta_metadata tag for this object looks like this (where compound_document is a built-in type):
<meta_metadata name="restaurant" extends="compound_document" comment="The restaurant class" >

Here we are defining a generic meta_metadata type for all kinds of restaurants, while urbanspoon.com is one of many information sources that provide data of this type.

Because this is the generic restaurant class, the parser attribute is excluded -- different sources may use different extraction methods. It will be added later when we create a meta_metadata object specific to UrbanSpoon.

Now that the object is defined, we will create some information fields. These fields will be things like the name of the restaurant, its user rating, an image of the restaurant, etc. Fields are defined using one of three tags: scalar, composite, or collection.

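The attributes needed for each field type are described below. Only the attributes that appear in the complete definition later in this tutorial are listed here:

for scalar: name, scalar_type, comment (hide is also set on some fields)

for composite: not used in the restaurant definition

for collection: name, child_type, generate_class, comment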
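
To make the field tags concrete, here is a sketch of one scalar field and one collection field, both copied from the restaurant definition later in this tutorial. The comments give a reading of each attribute inferred from the attribute names and the example values, so treat them as assumptions rather than reference documentation. No composite field is shown because the restaurant definition does not use one.

```xml
<!-- scalar: a single value. name identifies the field, scalar_type gives
     the value's type (ParsedURL is used for URLs here, String for text),
     and comment documents it. hide="true" presumably keeps the field out
     of the displayed metadata (assumption). -->
<scalar name="pic" scalar_type="ParsedURL" hide="true" comment="A picture from the restaurant" />

<!-- collection: a list of values. child_type names the type of each
     element. generate_class="false" presumably tells the compiler not to
     generate a new class for this field (assumption). -->
<collection name="genres" child_type="document" generate_class="false" comment="The genres of food offered" />
```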

To decide which information fields we want to gather, let's take a look at the restaurant page and see what is available and what we think could be useful. (As a rule, it is always better to get the information even if you are not sure whether it will be needed.)


I have boxed in blue the information that I think would be good to have.

The complete meta_metadata object will be:
<meta_metadata_repository name="urban_spoon" package="ecologylab.semantics.generated.library.tutorial.urbanspoon" >

<meta_metadata name="restaurant" extends="compound_document" comment="The restaurant class" >

<scalar name="phone" scalar_type="String" comment="Phone number of the restaurant" />

<scalar name="pic" scalar_type="ParsedURL" hide="true" comment="A picture from the restaurant" />

<scalar name="link" scalar_type="ParsedURL" comment="Link to the restaurant's website" />

<scalar name="rating" scalar_type="String" comment="Rating of the restaurant" />

<scalar name="price_range" scalar_type="String" comment="Price range of the restaurant" />

<scalar name="map" scalar_type="ParsedURL" hide="true" comment="Map image of the restaurant's location or link to a directions page" />

<collection name="genres" child_type="document" generate_class="false" comment="The genres of food offered" />

</meta_metadata>

</meta_metadata_repository>

Some key things to observe from this class:

Note: To use your newly defined meta_metadata in an application, you may need to compile it into source code with the MetaMetadata Compiler. The compiler is in the package ecologylab.semantics.compiler in the simplTranslators project (see environment setup).

In Eclipse, launch files named CompileMmdTo*.launch (where * is your target language name) are available in that project so that you can run the compiler conveniently.

Hopefully you have now learned how a data structure is defined for a meta_metadata object.

The next tutorial will cover information extraction.