<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Alexander Kiel &#187; Web Standards</title>
	<atom:link href="http://www.alexanderkiel.net/category/web/web_standards/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.alexanderkiel.net</link>
	<description>On Photography and other Things</description>
	<lastBuildDate>Wed, 27 Jan 2010 15:13:04 +0000</lastBuildDate>
	<generator>http://wordpress.org/?v=2.9.1</generator>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
			<item>
		<title>Using wget and the WDG Offline Validator to link check and validate your whole web site</title>
		<link>http://www.alexanderkiel.net/2007/09/18/using-wget-and-the-wdg-offline-validator-to-link-check-and-validate-your-whole-web-site/</link>
		<comments>http://www.alexanderkiel.net/2007/09/18/using-wget-and-the-wdg-offline-validator-to-link-check-and-validate-your-whole-web-site/#comments</comments>
		<pubDate>Tue, 18 Sep 2007 18:10:13 +0000</pubDate>
		<dc:creator>Alexander Kiel</dc:creator>
				<category><![CDATA[Web Development]]></category>
		<category><![CDATA[Web Standards]]></category>

		<guid isPermaLink="false">http://alexk.homeip.net/2007/09/18/using-wget-and-the-wdg-offline-validator-to-link-check-and-validate-your-whole-web-site/</guid>
		<description><![CDATA[If you have a large web site or web application and care about dead links and valid HTML, checking for both is a real pain. When I ran into this problem, I collected a few tools, which I present here. 

While the fastest way for manual link checking is the LinkChecker [...]]]></description>
			<content:encoded><![CDATA[<p>If you have a large web site or web application and care about dead links and valid <acronym title="HyperText Markup Language">HTML</acronym>, checking for both is a real pain. When I ran into this problem, I collected a few tools, which I present here. <span id="more-35"></span></p>

<p>While the fastest way to check links manually is the <a href="https://addons.mozilla.org/en-US/firefox/addon/532">LinkChecker Firefox Plugin</a>, it is not so easy to check a whole site for 404s. The same goes for validation. For quick manual checks of a single page, I recommend installing the <a href="https://addons.mozilla.org/en-US/firefox/addon/60">Web Developer Toolbar</a> in Firefox and simply pressing Shift + Ctrl + H or Shift + Ctrl + A. But how do you validate all pages of a site? And how do you do it offline, for performance reasons?</p>

<p>My approach is to mirror the whole web site with wget, look at the wget log for dead links and use the <acronym title="Web Design Group">WDG</acronym> Offline Validator to validate the mirrored <acronym title="HyperText Markup Language">HTML</acronym> pages.</p>

<h2>Mirror with wget</h2>

<p>I assume you use Linux or a Unix-like system, so wget is either familiar to you already or easy to get for your system. It&#8217;s a pretty basic but powerful tool.</p>

<p>First, create a new directory into which wget can download all your pages. Then run this command in your console to fetch your whole site:</p>

<ol class="code"><li class="alt"><code>wget --mirror --keep-session-cookies -o wget.log http://www.example.com/</code></li></ol>

<p>Replace <code>http://www.example.com/</code>, which is only a placeholder, with the address of your own site. I use the <code>--mirror</code> switch to simply fetch everything. The <code>--keep-session-cookies</code> switch is useful if your site is dynamically generated, as this blog is, for example. <code>-o wget.log</code> tells wget to write its output to that file. Make sure your server can withstand the load!</p>

<p>Once wget finishes, you can use <code>less</code> or your favorite editor to search wget.log for 404s and the string <code>error</code>. That is all you need for link checking.</p>
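
<p>The manual search can also be scripted. The following is a minimal sketch, assuming the usual wget log layout in which each request starts with a line beginning with <code>--</code> and ending with the URL, and a failed request is later marked by a line containing <code>ERROR 404</code>. The function name <code>dead_links</code> is a hypothetical helper, not part of wget.</p>

```shell
#!/bin/sh
# List the URLs that wget reported as "404 Not Found".
# Assumption: request lines start with "--" and end with the URL, and a
# failed request is followed by a line containing "ERROR 404".
dead_links() {
    awk '
        /^--/       { url = $NF }   # remember the most recent request URL
        /ERROR 404/ { print url }   # report it when wget logs a 404
    ' "$1" | sort -u
}

# Only run against wget.log if it exists in the current directory.
if [ -f wget.log ]; then
    dead_links wget.log
fi
```

<p>Note that this prints the URLs that could not be fetched, each one once, not the pages that link to them. You still have to search your mirrored pages for those addresses to find the broken links themselves.</p>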

<h2>Validation with the <acronym title="Web Design Group">WDG</acronym> Offline Validator</h2>

<p>I searched for a while for a usable offline validator. The W3C one is a <acronym title="Common Gateway Interface">CGI</acronym> script which needs a running Apache, and you have to send an <acronym title="Hypertext Transfer Protocol">HTTP</acronym> POST request in order to check a local <acronym title="HyperText Markup Language">HTML</acronym> file. It is basically the same thing as the public W3C Validator. The next one I decided against is a Windows application called <a href="http://arealvalidator.com/">A Real Validator</a>. Its disadvantages are that it costs money and has a somewhat dated <acronym title="Graphical User Interface">GUI</acronym> which cannot filter for invalid pages only. So you have to scroll through hundreds of valid pages to find your invalid ones.</p>

<p>So in the end I use the <a href="http://htmlhelp.com/tools/validator/offline/index.html.en"><acronym title="Web Design Group">WDG</acronym> Offline Validator</a>. You can get it from that site, but the best thing is that <a href="http://www.debian.org/">Debian</a> and <a href="http://www.ubuntu.com">Ubuntu</a> have it available in their package repositories. So you can just type:</p>

<ol class="code"><li class="alt"><code>sudo apt-get install wdg-html-validator</code></li></ol>

<p>(Make sure you have the universe repository in your sources list.)</p>

<p>To validate all the <acronym title="HyperText Markup Language">HTML</acronym> pages wget downloaded, just type:</p>

<ol class="code"><li class="alt"><code>find . -name "*.html" -exec validate -w {} \; &gt; validation.log</code></li></ol>

<p>This command finds all <acronym title="HyperText Markup Language">HTML</acronym> files below your current directory, runs the validate command on each one and writes the results to the validation.log file. While it runs, you can watch validation.log with tail, or you can view it afterwards in whatever editor you like best.</p>

<p>That is basically all you need to check your whole site for dead links and valid <acronym title="HyperText Markup Language">HTML</acronym>.</p>]]></content:encoded>
			<wfw:commentRss>http://www.alexanderkiel.net/2007/09/18/using-wget-and-the-wdg-offline-validator-to-link-check-and-validate-your-whole-web-site/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Some thoughts on Web Standards</title>
		<link>http://www.alexanderkiel.net/2007/08/30/some-thoughts-on-web-standards/</link>
		<comments>http://www.alexanderkiel.net/2007/08/30/some-thoughts-on-web-standards/#comments</comments>
		<pubDate>Thu, 30 Aug 2007 12:52:34 +0000</pubDate>
		<dc:creator>Alexander Kiel</dc:creator>
				<category><![CDATA[Web Development]]></category>
		<category><![CDATA[Web Standards]]></category>

		<guid isPermaLink="false">http://alexk.homeip.net/2007/08/30/some-thoughts-on-web-standards/</guid>
		<description><![CDATA[Today I&#8217;m about to renew my blog. One question I ask myself is: Should I use HTML 4.01 Strict or XHTML 1.0 Strict? To decide, I am currently looking through the blogs and web pages of well-known web standards gurus. 

On 456 Berea Street I read the following on the Accessibility Page:



This [...]]]></description>
			<content:encoded><![CDATA[<p>Today I&#8217;m about to renew my blog. One question I ask myself is: Should I use <acronym title="HyperText Markup Language">HTML</acronym> 4.01 Strict or <acronym title="Extensible HyperText Markup Language">XHTML</acronym> 1.0 Strict? To decide, I am currently looking through the blogs and web pages of well-known web standards gurus. <span id="more-5"></span></p>

<p>On <a href="http://www.456bereastreet.com">456 Berea Street</a> I read the following on the <a href="http://www.456bereastreet.com/accessibility/">Accessibility Page</a>:</p>

<blockquote>
<p>
This site is built on valid <acronym title="HyperText Markup Language">HTML</acronym> 4.01 Strict for structure and <acronym title="Cascading Style Sheets">CSS</acronym> for presentation.
</p>
<p>
A modern web browser like Firefox, Safari or Opera is needed to make the most out of this site, but thanks to the separation of content and presentation it should be accessible to any browsing device, including Internet Explorer.
</p>
</blockquote>

<p>Take this as the funny part of this post. Now I&#8217;m going into the silly <acronym title="HyperText Markup Language">HTML</acronym> vs. <acronym title="Extensible HyperText Markup Language">XHTML</acronym> discussion.</p>

<h3><acronym title="HyperText Markup Language">HTML</acronym> 4.01 Strict?</h3>

<p>So the real experts use <acronym title="HyperText Markup Language">HTML</acronym> 4.01 Strict today, even if they used <acronym title="Extensible HyperText Markup Language">XHTML</acronym> in the past. The key reason why choosing <acronym title="Extensible HyperText Markup Language">XHTML</acronym> makes little sense today is that you have to deliver (X)<acronym title="HyperText Markup Language">HTML</acronym> pages as text/html, because the famous Internet Explorer doesn&#8217;t understand application/xhtml+xml. But if you deliver them as text/html, all browsers interpret your nice <acronym title="Extensible HyperText Markup Language">XHTML</acronym> as tag soup anyway. That&#8217;s why some of the people who care about this stuff switched back to <acronym title="HyperText Markup Language">HTML</acronym>.</p>
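
<p>To illustrate the delivery problem: a server could, in principle, choose the MIME type per request by looking at the Accept header the browser sends. The following is only a rough sketch under stated assumptions. <code>pick_content_type</code> is a hypothetical helper, the header value is assumed to arrive in the CGI variable <code>HTTP_ACCEPT</code>, and q-values in the header are ignored.</p>

```shell
#!/bin/sh
# Pick a MIME type for an XHTML page from the value of the HTTP Accept header.
# Simplification: only checks whether the XHTML type is mentioned at all and
# ignores q-values.
pick_content_type() {
    case "$1" in
        *application/xhtml+xml*) echo "application/xhtml+xml" ;;
        *)                       echo "text/html" ;;
    esac
}

# In a CGI script the header value would be available as $HTTP_ACCEPT.
pick_content_type "${HTTP_ACCEPT:-}"
```

<p>A browser that does not announce application/xhtml+xml, such as Internet Explorer, would fall through to text/html, which is exactly the case in which the markup gets treated as tag soup.</p>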

<p>After playing with WordPress and its themes and plugins for some time, I realized that it is not simple to switch to <acronym title="HyperText Markup Language">HTML</acronym> 4.01 Strict. The whole WordPress world uses <acronym title="Extensible HyperText Markup Language">XHTML</acronym> 1.0, and every single piece outputs <code> /></code> instead of <code>></code> on empty elements. So without rewriting nearly all WordPress code, it would not be possible to generate truly valid <acronym title="HyperText Markup Language">HTML</acronym>.</p>

<p>The other fact that pushes me towards <acronym title="Extensible HyperText Markup Language">XHTML</acronym> is an article by Christoph Schneegans titled <a class="external" href="http://schneegans.de/web/xhtml/" hreflang="de"><acronym title="Extensible HyperText Markup Language">XHTML</acronym> oder <acronym title="HyperText Markup Language">HTML</acronym>?</a>. He says that an <acronym title="HyperText Markup Language">HTML</acronym> validator won&#8217;t complain about valid SGML shortcuts, which can cause rendering errors in browsers. To cite him, this markup is perfectly valid HTML:</p>

<blockquote cite="http://schneegans.de/web/xhtml/">
<ol class="code">
<li><code>&lt;!DOCTYPE html PUBLIC "-//W3C//<acronym title="Document Type Definition">DTD</acronym> <acronym title="HyperText Markup Language">HTML</acronym> 4.01//EN"&gt;</code></li>
<li><code>&lt;&gt;</code></li>
<li><code>&lt;title//</code></li>
<li><code>&lt;p ltr&lt;span&gt;&lt;/span&lt;/p&gt;</code></li>
<li><code>&lt;/&gt;</code></li>
</ol>
</blockquote>

<p>and it is equivalent to:</p>

<blockquote cite="http://schneegans.de/web/xhtml/">
<ol class="code">
<li><code>&lt;!DOCTYPE html PUBLIC "-//W3C//<acronym title="Document Type Definition">DTD</acronym> <acronym title="HyperText Markup Language">HTML</acronym> 4.01//EN"&gt;</code></li>
<li><code>&lt;html&gt;</code></li>
<li><code>&lt;head&gt;</code></li>
<li><code>&lt;title&gt;&lt;/title&gt;</code></li>
<li><code>&lt;body&gt;</code></li>
<li><code>&lt;p dir="ltr"&gt;&lt;span&gt;&lt;/span&gt;&lt;/p&gt;</code></li>
<li><code>&lt;/body&gt;</code></li>
<li><code>&lt;/html&gt;</code></li>
</ol>
</blockquote>

<p><strong>To summarize:</strong> I will use <acronym title="Extensible HyperText Markup Language">XHTML</acronym> 1.0 Strict delivered as &#8220;text/html&#8221; for now and maybe (X)HTML5 in a few years.</p>]]></content:encoded>
			<wfw:commentRss>http://www.alexanderkiel.net/2007/08/30/some-thoughts-on-web-standards/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>

