This is a follow-up to my question with the subject line “Seeking Guidance on using the XML documentation for my sample.” I originally posted it there, but I noticed that it had been marked “closed”, so I was not sure anybody had seen my follow-up question.
Is the primary function of the XSD (in this case) to be a standard for constructing and checking a well-formed XML document which is itself essentially self-documenting? Or is it a set of instructions to the application used to view or extract data from the XML document, such that the document is incomplete or cannot be interpreted without the XSD?
To the extent that the XSD contributes to the interpretation of the XML, is that contribution primarily formatting for display of the XML’s structure, or instructions on interpreting the XML’s meaningful contents?
Also, is there human readable documentation on the structure of IPUMS-CPS XML files in particular, or are those files considered to be self-documenting?
Should I be looking at my XML file with a browser? I have been using Notepad++ because it does XML syntax highlighting, which it seems to do pretty well (on the XSD file too). But if it needs to grab the XSD file from http://www.ddialliance.org/ in order to display correctly, I don’t know if it does that – probably not.
Chrome opens the XSD file as indented text (without syntax highlighting). When I try to open the downloaded XML file for one of my samples with Chrome, nothing happens. Do I need a plug-in?
Let me tell you what I want to do, as that may clarify the meaning of my questions. I am hoping to write code that automatically extracts metadata from the XML file for a sample. That metadata should, first, give PostgreSQL what it needs to read the file and, second, give R what it needs to interpret the data it then imports from PostgreSQL. Finally, I want to build functions to pull out anything in the file that is really aimed at a human, e.g. the text variable descriptions, and produce a little report on a single variable, or a big report on all the variables, on request.
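To make the plan concrete, here is a rough Python sketch of the kind of extraction I have in mind. It is only an illustration under assumptions: the embedded sample is a made-up, heavily simplified stand-in, and the element and attribute names in it (var, location, StartPos, EndPos, labl, txt, catgry, catValu) and the namespace are my guesses at the layout, not something I have confirmed against the actual IPUMS-CPS file. The function names are hypothetical too.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified stand-in for a codebook file. The real file may use
# different element names, attributes, and namespaces; treat these as assumptions.
SAMPLE = """<codeBook xmlns="ddi:codebook:2_5">
  <dataDscr>
    <var name="AGE">
      <location StartPos="1" EndPos="2" width="2"/>
      <labl>Age</labl>
      <txt>Age of the person at last birthday.</txt>
    </var>
    <var name="SEX">
      <location StartPos="3" EndPos="3" width="1"/>
      <labl>Sex</labl>
      <txt>Sex of the person.</txt>
      <catgry><catValu>1</catValu><labl>Male</labl></catgry>
      <catgry><catValu>2</catValu><labl>Female</labl></catgry>
    </var>
  </dataDscr>
</codeBook>"""


def local(tag):
    """Drop the '{namespace}' prefix that ElementTree puts on tag names."""
    return tag.rsplit("}", 1)[-1]


def child_text(elem, name):
    """Text of the first direct child called `name`, or '' if there is none."""
    for child in elem:
        if local(child.tag) == name:
            return (child.text or "").strip()
    return ""


def extract_variables(xml_text):
    """Collect per-variable metadata: column positions, label, text, value labels."""
    root = ET.fromstring(xml_text)
    variables = []
    for var in root.iter():
        if local(var.tag) != "var":
            continue
        loc = next((c for c in var if local(c.tag) == "location"), None)
        categories = {
            child_text(c, "catValu"): child_text(c, "labl")
            for c in var
            if local(c.tag) == "catgry"
        }
        variables.append({
            "name": var.get("name"),
            "start": int(loc.get("StartPos")) if loc is not None else None,
            "end": int(loc.get("EndPos")) if loc is not None else None,
            "label": child_text(var, "labl"),
            "description": child_text(var, "txt"),
            "categories": categories,
        })
    return variables


def report(var):
    """Plain-text report on one variable, aimed at a human reader."""
    lines = [
        f"{var['name']}: {var['label']}",
        f"  columns {var['start']}-{var['end']}",
        f"  {var['description']}",
    ]
    for value, label in var["categories"].items():
        lines.append(f"    {value} = {label}")
    return "\n".join(lines)


if __name__ == "__main__":
    for v in extract_variables(SAMPLE):
        print(report(v))
```

The column positions would feed the PostgreSQL load step, the value labels would become factor levels in R, and report() covers the human-readable part. Note that this sketch never touches the XSD: it only needs the XML to parse.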
Does this seem like a sensible, appropriate way to use the XML documentation? Or have I misinterpreted its purpose?
Looking forward to your response, Andrew H.