<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Kier&#039;s Blog &#187; Internet</title>
	<atom:link href="http://www.kierdugan.com/category/software/internet/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.kierdugan.com</link>
	<description>Damn right.</description>
	<lastBuildDate>Fri, 11 Mar 2011 23:36:36 +0000</lastBuildDate>
	<generator>http://wordpress.org/?v=2.9.2</generator>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
			<item>
		<title>PowerPython&#8230; or PythonPoint&#8230; or something</title>
		<link>http://www.kierdugan.com/2010/07/08/powerpython-or-pythonpoint-or-something/</link>
		<comments>http://www.kierdugan.com/2010/07/08/powerpython-or-pythonpoint-or-something/#comments</comments>
		<pubDate>Wed, 07 Jul 2010 23:17:44 +0000</pubDate>
		<dc:creator>Kier</dc:creator>
				<category><![CDATA[Internet]]></category>
		<category><![CDATA[Software]]></category>
		<category><![CDATA[Uni]]></category>
		<category><![CDATA[Windows]]></category>
		<category><![CDATA[COM]]></category>
		<category><![CDATA[Email]]></category>
		<category><![CDATA[IMAP]]></category>
		<category><![CDATA[PowerPoint]]></category>
		<category><![CDATA[Programming]]></category>
		<category><![CDATA[Python]]></category>
		<category><![CDATA[pythoncom]]></category>
		<category><![CDATA[win32com]]></category>

		<guid isPermaLink="false">http://www.kierdugan.com/?p=96</guid>
		<description><![CDATA[I&#8217;ve been meaning to update this for a fair while now as an uncharacteristically large amount of stuff has happened.<a href="http://www.kierdugan.com/2010/07/08/powerpython-or-pythonpoint-or-something/" class="searchmore">Read the Rest...</a><div class="clr"></div>]]></description>
			<content:encoded><![CDATA[<p>I&#8217;ve been meaning to update this for a fair while now as an uncharacteristically large amount of stuff has happened. Since exams finished I&#8217;ve managed to get a job at <a title="Electronics and Computer Science" href="http://www.ecs.soton.ac.uk/" target="_blank">ECS</a> working for two of my lecturers on two separate projects, which is pretty good because it means my work is varied. Both are IC design projects though, so there is a similar vein running through them.</p>
<p>One of my minor duties on this dual-job is to assemble slides from about twelve people into a large presentation, with cover slides for each speaker, every Friday for a progress meeting we all have. Naturally the first Friday I just did it by hand by importing each one in turn into PowerPoint. However it is a fairly tedious job, and to paraphrase a certain member of staff: why do something by hand when I have a powerful computer under the desk?</p>
<p>So I began to investigate automating the process.</p>
<p>Turns out that <a title="Python" href="http://www.python.org" target="_blank">Python</a> has an <a title="imaplib" href="http://docs.python.org/library/imaplib.html" target="_blank">IMAP module</a> in its standard library, which isn&#8217;t <em>too </em>surprising I suppose as the Python standard library is <strong>enormous</strong>. After some playing I managed to write a program that logged into my university email account and downloaded the appropriate PowerPoint attachments.</p>
<p><span id="more-96"></span></p>

<div class="wp_syntax"><div class="code"><pre class="python" style="font-family:monospace;"><span style="color: #ff7700;font-weight:bold;">from</span> <span style="color: #dc143c;">imaplib</span> <span style="color: #ff7700;font-weight:bold;">import</span> IMAP4_SSL
<span style="color: #ff7700;font-weight:bold;">from</span> <span style="color: #dc143c;">email</span> <span style="color: #ff7700;font-weight:bold;">import</span> message_from_string
<span style="color: #ff7700;font-weight:bold;">from</span> <span style="color: #dc143c;">datetime</span> <span style="color: #ff7700;font-weight:bold;">import</span> <span style="color: #dc143c;">datetime</span>, timedelta
<span style="color: #ff7700;font-weight:bold;">import</span> <span style="color: #dc143c;">re</span>
&nbsp;
M = IMAP4_SSL <span style="color: black;">&#40;</span><span style="color: #483d8b;">'imapserver'</span><span style="color: black;">&#41;</span>
M.<span style="color: black;">login</span> <span style="color: black;">&#40;</span><span style="color: #483d8b;">'user'</span>, <span style="color: #483d8b;">'pass'</span><span style="color: black;">&#41;</span>
M.<span style="color: #dc143c;">select</span> <span style="color: black;">&#40;</span><span style="color: black;">&#41;</span>
&nbsp;
since = <span style="color: black;">&#40;</span><span style="color: #dc143c;">datetime</span>.<span style="color: black;">today</span> <span style="color: black;">&#40;</span><span style="color: black;">&#41;</span> - timedelta <span style="color: black;">&#40;</span>days=<span style="color: #ff4500;">5</span><span style="color: black;">&#41;</span><span style="color: black;">&#41;</span>.<span style="color: black;">strftime</span> <span style="color: black;">&#40;</span><span style="color: #483d8b;">&quot;%d-%b-%Y&quot;</span><span style="color: black;">&#41;</span>
&nbsp;
<span style="color: #ff7700;font-weight:bold;">for</span> <span style="color: #dc143c;">user</span> <span style="color: #ff7700;font-weight:bold;">in</span> users:
    typ, data = M.<span style="color: black;">search</span> <span style="color: black;">&#40;</span><span style="color: #008000;">None</span>, <span style="color: #483d8b;">'(HEADER FROM &quot;%s&quot; SINCE %s)'</span> <span style="color: #66cc66;">%</span> <span style="color: black;">&#40;</span><span style="color: #dc143c;">user</span>, since<span style="color: black;">&#41;</span><span style="color: black;">&#41;</span>
&nbsp;
    <span style="color: #ff7700;font-weight:bold;">for</span> <span style="color: #008000;">id</span> <span style="color: #ff7700;font-weight:bold;">in</span> data<span style="color: black;">&#91;</span><span style="color: #ff4500;">0</span><span style="color: black;">&#93;</span>.<span style="color: black;">split</span> <span style="color: black;">&#40;</span><span style="color: black;">&#41;</span>:
        typ, data = M.<span style="color: black;">fetch</span> <span style="color: black;">&#40;</span><span style="color: #008000;">id</span>, <span style="color: #483d8b;">'RFC822'</span><span style="color: black;">&#41;</span>
        msg = message_from_string <span style="color: black;">&#40;</span>data<span style="color: black;">&#91;</span><span style="color: #ff4500;">0</span><span style="color: black;">&#93;</span><span style="color: black;">&#91;</span><span style="color: #ff4500;">1</span><span style="color: black;">&#93;</span><span style="color: black;">&#41;</span>
        <span style="color: #ff7700;font-weight:bold;">if</span> <span style="color: #ff7700;font-weight:bold;">not</span> msg.<span style="color: black;">is_multipart</span> <span style="color: black;">&#40;</span><span style="color: black;">&#41;</span>:
            <span style="color: #ff7700;font-weight:bold;">continue</span>
&nbsp;
        <span style="color: #ff7700;font-weight:bold;">for</span> part <span style="color: #ff7700;font-weight:bold;">in</span> msg.<span style="color: black;">walk</span> <span style="color: black;">&#40;</span><span style="color: black;">&#41;</span>:
            fn  = part.<span style="color: black;">get_filename</span> <span style="color: black;">&#40;</span><span style="color: black;">&#41;</span> <span style="color: #ff7700;font-weight:bold;">or</span> <span style="color: #483d8b;">''</span>  <span style="color: #808080; font-style: italic;"># container parts have no filename</span>
            ext = <span style="color: #dc143c;">re</span>.<span style="color: black;">match</span> <span style="color: black;">&#40;</span>r<span style="color: #483d8b;">'.*<span style="color: #000099; font-weight: bold;">\.</span>(pptx?)$'</span>, fn<span style="color: black;">&#41;</span>
            <span style="color: #ff7700;font-weight:bold;">if</span> <span style="color: #ff7700;font-weight:bold;">not</span> ext:
                <span style="color: #ff7700;font-weight:bold;">continue</span>
&nbsp;
            fp = <span style="color: #008000;">file</span> <span style="color: black;">&#40;</span><span style="color: #483d8b;">'%s.%s.temp'</span> <span style="color: #66cc66;">%</span> <span style="color: black;">&#40;</span><span style="color: #dc143c;">user</span>, ext.<span style="color: black;">group</span> <span style="color: black;">&#40;</span><span style="color: #ff4500;">1</span><span style="color: black;">&#41;</span><span style="color: black;">&#41;</span>, <span style="color: #483d8b;">'w'</span><span style="color: black;">&#41;</span>
            fp.<span style="color: black;">write</span> <span style="color: black;">&#40;</span>part.<span style="color: black;">get_payload</span> <span style="color: black;">&#40;</span><span style="color: black;">&#41;</span><span style="color: black;">&#41;</span>
            fp.<span style="color: black;">close</span> <span style="color: black;">&#40;</span><span style="color: black;">&#41;</span>
&nbsp;
M.<span style="color: black;">close</span> <span style="color: black;">&#40;</span><span style="color: black;">&#41;</span></pre></div></div>

<p>This code would download any <code>.ppt</code> or <code>.pptx</code> attachment from anyone with an email address in the <code>users</code> list and save the file as <code>&lt;user&gt;.ppt.temp</code> or <code>&lt;user&gt;.pptx.temp</code>. It should be possible to decode the message part with <a title="email.message.Message.get_payload" href="http://docs.python.org/library/email.message.html#email.message.Message.get_payload" target="_blank"><code>msg.get_payload (decode=True)</code></a> but it seemed to introduce a fair number of errors into the file. I think this is because it seems to convert line-by-line instead of as one large block. So I used the following code to fix this.</p>

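<div class="wp_syntax"><div class="code"><pre class="python" style="font-family:monospace;"><span style="color: #808080; font-style: italic;"># Needed at the top of the script</span>
<span style="color: #ff7700;font-weight:bold;">from</span> <span style="color: #dc143c;">base64</span> <span style="color: #ff7700;font-weight:bold;">import</span> b64decode
<span style="color: #ff7700;font-weight:bold;">import</span> <span style="color: #dc143c;">quopri</span>
&nbsp;
<span style="color: #808080; font-style: italic;"># Inside the loop over msg.walk ()</span>
raw      = part.<span style="color: black;">get_payload</span> <span style="color: black;">&#40;</span><span style="color: black;">&#41;</span>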
raw      = <span style="color: #dc143c;">re</span>.<span style="color: black;">sub</span> <span style="color: black;">&#40;</span>r<span style="color: #483d8b;">'[<span style="color: #000099; font-weight: bold;">\r</span><span style="color: #000099; font-weight: bold;">\n</span>]+'</span>, <span style="color: #483d8b;">''</span>, raw<span style="color: black;">&#41;</span>
encoding = part<span style="color: black;">&#91;</span><span style="color: #483d8b;">'Content-Transfer-Encoding'</span><span style="color: black;">&#93;</span>.<span style="color: black;">lower</span> <span style="color: black;">&#40;</span><span style="color: black;">&#41;</span>
<span style="color: #ff7700;font-weight:bold;">if</span> encoding == <span style="color: #483d8b;">'base64'</span>:
    raw = b64decode <span style="color: black;">&#40;</span>raw<span style="color: black;">&#41;</span>
<span style="color: #ff7700;font-weight:bold;">elif</span> encoding == <span style="color: #483d8b;">'quoted-printable'</span>:
    raw = <span style="color: #dc143c;">quopri</span>.<span style="color: black;">decodestring</span> <span style="color: black;">&#40;</span>raw<span style="color: black;">&#41;</span>
<span style="color: #ff7700;font-weight:bold;">else</span>:
    <span style="color: #ff7700;font-weight:bold;">print</span> <span style="color: #483d8b;">&quot;ERROR: Unknown coding strategy - ignoring.&quot;</span>
    <span style="color: #ff7700;font-weight:bold;">continue</span></pre></div></div>

<p>Not particularly pretty, but it does the job.</p>
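<p>For comparison, here is a minimal sketch of the <code>get_payload (decode=True)</code> route on its own. It assumes Python 3 (the listings in this post are Python 2), and the attachment bytes and the <code>slides.ppt</code> filename are made up for illustration; it is not the original script. In this sketch the decoded bytes come back identical to what went in, so one possible culprit for the corruption I saw is writing the result to a file opened in text mode (<code>'w'</code> rather than <code>'wb'</code>) on Windows, though that is a guess.</p>

```python
import base64
from email import message_from_bytes
from email.mime.application import MIMEApplication
from email.mime.multipart import MIMEMultipart

# Stand-in bytes for a real .ppt attachment, including CR/LF and NUL bytes
# so that any text-mode mangling would show up in the round trip.
original = b"\xd0\xcf\x11\xe0 fake slide data\r\n\x00\x01" * 50

# Build a multipart message with one base64-encoded attachment.
outer = MIMEMultipart()
attachment = MIMEApplication(original, _subtype="vnd.ms-powerpoint")
attachment.add_header("Content-Disposition", "attachment", filename="slides.ppt")
outer.attach(attachment)

# Serialise and re-parse, as if the message had come back from an IMAP fetch.
msg = message_from_bytes(outer.as_bytes())

recovered = {}
for part in msg.walk():
    name = part.get_filename()
    if name is None:  # the multipart container itself has no filename
        continue
    # decode=True undoes the Content-Transfer-Encoding and returns bytes.
    recovered[name] = part.get_payload(decode=True)

# The manual route for a base64 part: decode the text payload directly.
manual = base64.b64decode(attachment.get_payload())
```

<p>Both <code>recovered["slides.ppt"]</code> and <code>manual</code> hold the original bytes, which then need writing out with a file opened as <code>'wb'</code>.</p>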
<p>So now I had all the emails downloaded, but I still needed to merge them all into a single file. At first I was thinking of making a C++ program to exploit the PowerPoint COM interface, but then I found <a title="Python for Windows Extensions" href="http://starship.python.net/crew/mhammond/win32/" target="_blank">Python for Windows Extensions</a> which fully supports COM!</p>

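<div class="wp_syntax"><div class="code"><pre class="python" style="font-family:monospace;"><span style="color: #ff7700;font-weight:bold;">import</span> <span style="color: #dc143c;">os</span>
<span style="color: #ff7700;font-weight:bold;">import</span> pythoncom
<span style="color: #ff7700;font-weight:bold;">from</span> win32com.<span style="color: black;">client</span> <span style="color: #ff7700;font-weight:bold;">import</span> Dispatch, gencache
&nbsp;
pythoncom.<span style="color: black;">CoInitializeEx</span> <span style="color: black;">&#40;</span>pythoncom.<span style="color: black;">COINIT_APARTMENTTHREADED</span><span style="color: black;">&#41;</span>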
gencache.<span style="color: black;">EnsureModule</span> <span style="color: black;">&#40;</span><span style="color: #483d8b;">'{2DF8D04C-5BFA-101B-BDE5-00AA0044DE52}'</span>, <span style="color: #ff4500;">0</span>, <span style="color: #ff4500;">2</span>, <span style="color: #ff4500;">4</span><span style="color: black;">&#41;</span>
gencache.<span style="color: black;">EnsureDispatch</span> <span style="color: black;">&#40;</span><span style="color: #483d8b;">&quot;PowerPoint.Application.12&quot;</span><span style="color: black;">&#41;</span>
&nbsp;
<span style="color: #808080; font-style: italic;"># Create an instance of PowerPoint and presentation</span>
pp = Dispatch <span style="color: black;">&#40;</span><span style="color: #483d8b;">&quot;PowerPoint.Application.12&quot;</span><span style="color: black;">&#41;</span>
pp.<span style="color: black;">Activate</span> <span style="color: black;">&#40;</span><span style="color: black;">&#41;</span>
pres = pp.<span style="color: black;">Presentations</span>.<span style="color: black;">Add</span> <span style="color: black;">&#40;</span><span style="color: black;">&#41;</span>
&nbsp;
<span style="color: #808080; font-style: italic;"># Insert all the downloaded presentations</span>
count = <span style="color: #ff4500;">1</span>
<span style="color: #ff7700;font-weight:bold;">for</span> filename <span style="color: #ff7700;font-weight:bold;">in</span> <span style="color: #dc143c;">os</span>.<span style="color: black;">listdir</span> <span style="color: black;">&#40;</span><span style="color: #483d8b;">'presentations'</span><span style="color: black;">&#41;</span>:
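    <span style="color: #808080; font-style: italic;"># os.listdir returns bare names, so join them to the directory first</span>
    pres.<span style="color: black;">Slides</span>.<span style="color: black;">InsertFromFile</span> <span style="color: black;">&#40;</span><span style="color: #dc143c;">os</span>.<span style="color: black;">path</span>.<span style="color: black;">realpath</span> <span style="color: black;">&#40;</span><span style="color: #dc143c;">os</span>.<span style="color: black;">path</span>.<span style="color: black;">join</span> <span style="color: black;">&#40;</span><span style="color: #483d8b;">'presentations'</span>, filename<span style="color: black;">&#41;</span><span style="color: black;">&#41;</span>, count<span style="color: black;">&#41;</span>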
    count = pres.<span style="color: black;">Slides</span>.<span style="color: black;">Count</span>
&nbsp;
<span style="color: #808080; font-style: italic;"># Save and exit</span>
pres.<span style="color: black;">SaveAs</span> <span style="color: black;">&#40;</span><span style="color: #483d8b;">'compiled.pptx'</span><span style="color: black;">&#41;</span>
pres.<span style="color: black;">Close</span> <span style="color: black;">&#40;</span><span style="color: black;">&#41;</span>
pp.<span style="color: black;">Quit</span> <span style="color: black;">&#40;</span><span style="color: black;">&#41;</span>
pythoncom.<span style="color: black;">CoUninitialize</span> <span style="color: black;">&#40;</span><span style="color: black;">&#41;</span></pre></div></div>

<p>I was quite shocked at how straightforward it was to automate PowerPoint from Python, but this solution wasn&#8217;t quite good enough yet. Using <a title="MSDN - InsertFromFile" href="http://msdn.microsoft.com/en-us/library/bb265418%28v=office.12%29.aspx" target="_blank"><code>InsertFromFile</code></a> means that the imported presentation acquires the formatting of <code>pres</code> which is not what I wanted. Also, there appears to be a bug in PowerPoint 2007 which causes image references to be broken when importing from a <code>.pptx</code> into a <code>.pptx</code> with the COM interface.</p>
<p>Searching for a solution to the <em>import with formatting</em> issue led me to <a title="CopyWithSourceFormating" href="http://skp.mvps.org/pptxp001.htm" target="_blank">this awesome VBA function</a> which has been referenced many, many times. I ported this to Python and it worked perfectly! There was still the weird image problem, but I used a really crude fix for that:</p>

<div class="wp_syntax"><div class="code"><pre class="python" style="font-family:monospace;"><span style="color: #ff7700;font-weight:bold;">for</span> filename <span style="color: #ff7700;font-weight:bold;">in</span> <span style="color: black;">&#91;</span>f <span style="color: #ff7700;font-weight:bold;">for</span> f <span style="color: #ff7700;font-weight:bold;">in</span> <span style="color: #dc143c;">os</span>.<span style="color: black;">listdir</span> <span style="color: black;">&#40;</span><span style="color: #483d8b;">'presentations'</span><span style="color: black;">&#41;</span> <span style="color: #ff7700;font-weight:bold;">if</span> f<span style="color: black;">&#91;</span>-<span style="color: #ff4500;">4</span>:<span style="color: black;">&#93;</span> == <span style="color: #483d8b;">'pptx'</span><span style="color: black;">&#93;</span>:
    src = <span style="color: #dc143c;">os</span>.<span style="color: black;">path</span>.<span style="color: black;">realpath</span> <span style="color: black;">&#40;</span><span style="color: #dc143c;">os</span>.<span style="color: black;">path</span>.<span style="color: black;">join</span> <span style="color: black;">&#40;</span><span style="color: #483d8b;">'presentations'</span>, filename<span style="color: black;">&#41;</span><span style="color: black;">&#41;</span>
    pres = pp.<span style="color: black;">Presentations</span>.<span style="color: black;">Open</span> <span style="color: black;">&#40;</span>src<span style="color: black;">&#41;</span>
    pres.<span style="color: black;">SaveAs</span> <span style="color: black;">&#40;</span>src<span style="color: black;">&#91;</span>:-<span style="color: #ff4500;">1</span><span style="color: black;">&#93;</span><span style="color: black;">&#41;</span>  <span style="color: #808080; font-style: italic;"># strip the trailing 'x' so the name ends in .ppt</span>
    pres.<span style="color: black;">Close</span> <span style="color: black;">&#40;</span><span style="color: black;">&#41;</span></pre></div></div>

<p>Yes. I converted every <code>.pptx</code> into a <code>.ppt</code>. Nothing intelligent here. Finally, I wrote a function to replace all the fonts added by the import with Arial. Job done.</p>

<div class="wp_syntax"><div class="code"><pre class="python" style="font-family:monospace;">fonts = <span style="color: black;">&#123;</span><span style="color: black;">&#125;</span>
<span style="color: #ff7700;font-weight:bold;">for</span> i <span style="color: #ff7700;font-weight:bold;">in</span> <span style="color: #008000;">range</span> <span style="color: black;">&#40;</span><span style="color: #ff4500;">1</span>, pres.<span style="color: black;">Fonts</span>.<span style="color: black;">Count</span> + <span style="color: #ff4500;">1</span><span style="color: black;">&#41;</span>:
    fonts<span style="color: black;">&#91;</span>pres.<span style="color: black;">Fonts</span>.<span style="color: black;">Item</span> <span style="color: black;">&#40;</span>i<span style="color: black;">&#41;</span>.<span style="color: black;">Name</span><span style="color: black;">&#93;</span> = <span style="color: #ff4500;">1</span>
fonts = fonts.<span style="color: black;">keys</span> <span style="color: black;">&#40;</span><span style="color: black;">&#41;</span>
fonts.<span style="color: black;">remove</span> <span style="color: black;">&#40;</span>u<span style="color: #483d8b;">'Arial'</span><span style="color: black;">&#41;</span>
<span style="color: #ff7700;font-weight:bold;">for</span> font <span style="color: #ff7700;font-weight:bold;">in</span> fonts:
    pres.<span style="color: black;">Fonts</span>.<span style="color: black;">Replace</span> <span style="color: black;">&#40;</span>font, u<span style="color: #483d8b;">'Arial'</span><span style="color: black;">&#41;</span></pre></div></div>
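
<p>The selection step in that listing boils down to &#8220;every distinct font name except the one I want to keep&#8221;. A minimal sketch of just that step, assuming Python 3 and that the names have already been read out of <code>pres.Fonts</code> into a plain list; the helper name <code>fonts_to_replace</code> is mine for illustration and not part of the original script. Unlike <code>list.remove</code>, the set difference does not raise an error when Arial is not in the list.</p>

```python
def fonts_to_replace(font_names, keep="Arial"):
    """Return each distinct font name except the one being kept, sorted."""
    return sorted(set(font_names) - {keep})


# Example input with a duplicate, as a presentation might report it.
names = ["Calibri", "Arial", "Times New Roman", "Calibri"]
todo = fonts_to_replace(names)
```

<p>Each name in <code>todo</code> would then be passed to <code>pres.Fonts.Replace</code> as in the listing above.</p>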

<p>So after all that I have a Python program that logs into my email, downloads a load of PowerPoint files, converts them all to <code>.ppt</code> format, inserts them into a blank presentation, and then normalises the font to Arial. Not bad for a few hundred lines of code!</p>
<p>I did change the program a bit so that it would copy a template with title slides in and add the presentations to that instead of a blank file, and then set the date appropriately on the main title slide. But the key point here is that I&#8217;ve replaced my tedious Friday-morning activity with a single command: <code>makepres</code>.</p>
<p>Victory.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.kierdugan.com/2010/07/08/powerpython-or-pythonpoint-or-something/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Twitter Me Xerces!</title>
		<link>http://www.kierdugan.com/2010/03/25/twitter-me-xerces/</link>
		<comments>http://www.kierdugan.com/2010/03/25/twitter-me-xerces/#comments</comments>
		<pubDate>Thu, 25 Mar 2010 00:03:42 +0000</pubDate>
		<dc:creator>Kier</dc:creator>
				<category><![CDATA[Internet]]></category>
		<category><![CDATA[Random]]></category>
		<category><![CDATA[Software]]></category>
		<category><![CDATA[Windows]]></category>
		<category><![CDATA[HTTP]]></category>
		<category><![CDATA[libcurl]]></category>
		<category><![CDATA[Programming]]></category>
		<category><![CDATA[REST]]></category>
		<category><![CDATA[Twitter]]></category>
		<category><![CDATA[Xerces]]></category>
		<category><![CDATA[XML]]></category>

		<guid isPermaLink="false">http://www.kierdugan.com/?p=58</guid>
		<description><![CDATA[Following from the spirit of yesterday&#8217;s post, little victories&#8230;
Yesterday I managed to download the front page of my website using<a href="http://www.kierdugan.com/2010/03/25/twitter-me-xerces/" class="searchmore">Read the Rest...</a><div class="clr"></div>]]></description>
			<content:encoded><![CDATA[<p>Following from the spirit of yesterday&#8217;s post, little victories&#8230;</p>
<p>Yesterday I managed to download the front page of my website using <a href="http://curl.haxx.se/libcurl/" target="_blank">libcurl</a>. As good as that was as a learning experience, it wasn&#8217;t interesting or useful in the slightest. Today however, I decided to see if I could fetch my status updates from Twitter and display them in a program. So I had a look at the API documentation and it looks quite easy to use, with the exception of OAuth which I&#8217;m yet to get my head around. Thankfully, for now, basic authentication is still supported.</p>
<p>The Twitter API uses the REST (REpresentational State Transfer) paradigm, which means the server keeps no session <em>state</em> between requests; i.e. each transaction is considered separately. In practice it also means that it uses HTTP, which is pretty simple to understand. Basically in a REST protocol the URIs are objects in the system, and the HTTP verbs are how you interact with them. So a GET on the <span style="font-family: Courier New;">http://server/article?name=REST</span> object would download an <em>article</em> named <em>REST</em>. Simple eh? Check <a href="http://www.codeproject.com/KB/architecture/RESTWebServicesPart2.aspx" target="_blank">this article</a> if you&#8217;re interested.</p>
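<p>To make the &#8220;URI is the object, verb is the action&#8221; idea concrete, here is a minimal sketch in Python 3 using only the standard library. It builds requests for the made-up <code>http://server/article</code> address from the paragraph above without sending anything; the helper name <code>article_request</code> is hypothetical, and whether a real server would accept DELETE on that URI is an assumption.</p>

```python
from urllib.parse import parse_qs, urlencode, urlsplit
from urllib.request import Request


def article_request(name, method="GET"):
    """Build (but do not send) a request for the article resource."""
    url = "http://server/article?" + urlencode({"name": name})
    return Request(url, method=method)


fetch = article_request("REST")             # read the article
remove = article_request("REST", "DELETE")  # same URI, different verb

# The object being addressed is identified entirely by the URI.
query = parse_qs(urlsplit(fetch.full_url).query)
```

<p>Both requests carry the same URI; only the verb says what should happen to the article.</p>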
<p>Anyway, onto the meat &#8216;n&#8217; taters. Data in a REST transaction is typically stored as XML or JSON. I considered downloading <a href="http://pyyaml.org/wiki/LibYAML" target="_blank">LibYAML</a> and taking the JSON route but a) I already had <a href="http://xerces.apache.org/xerces-c" target="_blank">Xerces</a>, b) I understand XML better than JSON, and c) I couldn&#8217;t be bothered to learn yet another new thing.</p>
<p><span id="more-58"></span>Xerces is incredibly well written. If you look at the class listings of Xerces or <a href="http://xml.apache.org/xalan-c/" target="_blank">Xalan</a> you&#8217;ll appreciate they&#8217;re both <strong>enormous</strong> and support basically everything. In fact, right out of the box Xerces supports fetching XML documents over the internet using HTTP GET. I chose not to use this purely because I wanted to use libcurl. Thankfully libcurl is surprisingly easy to use:</p>

<div class="wp_syntax"><div class="code"><pre class="cpp" style="font-family:monospace;"><span style="color: #0000ff;">static</span> <span style="color: #0000ff;">size_t</span> _CurlWriteCB <span style="color: #008000;">&#40;</span><span style="color: #0000ff;">void</span><span style="color: #000040;">*</span> ptr, <span style="color: #0000ff;">size_t</span> nLen, <span style="color: #0000ff;">size_t</span> cbElem,
                            CMemFile<span style="color: #000040;">*</span> pFile<span style="color: #008000;">&#41;</span>
<span style="color: #008000;">&#123;</span>
    <span style="color: #0000ff;">size_t</span> cbSizeAtStart<span style="color: #008080;">;</span>
    <span style="color: #0000ff;">size_t</span> cbSizeAtEnd<span style="color: #008080;">;</span>
&nbsp;
    <span style="color: #666666;">// Write data to file, but measure buffer size before and after.</span>
    cbSizeAtStart <span style="color: #000080;">=</span> <span style="color: #008000;">&#40;</span><span style="color: #0000ff;">size_t</span><span style="color: #008000;">&#41;</span>pFile<span style="color: #000040;">-</span><span style="color: #000080;">&gt;</span>GetLength <span style="color: #008000;">&#40;</span><span style="color: #008000;">&#41;</span><span style="color: #008080;">;</span>
    pFile<span style="color: #000040;">-</span><span style="color: #000080;">&gt;</span>Write <span style="color: #008000;">&#40;</span>ptr, <span style="color: #008000;">&#40;</span>UINT<span style="color: #008000;">&#41;</span><span style="color: #008000;">&#40;</span>nLen <span style="color: #000040;">*</span> cbElem<span style="color: #008000;">&#41;</span><span style="color: #008000;">&#41;</span><span style="color: #008080;">;</span>
    cbSizeAtEnd   <span style="color: #000080;">=</span> <span style="color: #008000;">&#40;</span><span style="color: #0000ff;">size_t</span><span style="color: #008000;">&#41;</span>pFile<span style="color: #000040;">-</span><span style="color: #000080;">&gt;</span>GetLength <span style="color: #008000;">&#40;</span><span style="color: #008000;">&#41;</span><span style="color: #008080;">;</span>
&nbsp;
    <span style="color: #666666;">// Return the difference in buffer size, i.e. number of bytes written.</span>
    <span style="color: #0000ff;">return</span> <span style="color: #008000;">&#40;</span>cbSizeAtEnd <span style="color: #000040;">-</span> cbSizeAtStart<span style="color: #008000;">&#41;</span><span style="color: #008080;">;</span>
<span style="color: #008000;">&#125;</span>
&nbsp;
BYTE<span style="color: #000040;">*</span> GetStatusesFromTwitter <span style="color: #008000;">&#40;</span><span style="color: #0000ff;">char</span><span style="color: #000040;">*</span> szUserName, UINT<span style="color: #000040;">&amp;</span> uiSize<span style="color: #008000;">&#41;</span>
<span style="color: #008000;">&#123;</span>
    <span style="color: #666666;">// Attempt to initialise curl</span>
    CURL<span style="color: #000040;">*</span> curl <span style="color: #000080;">=</span> curl_easy_init <span style="color: #008000;">&#40;</span><span style="color: #008000;">&#41;</span><span style="color: #008080;">;</span>
    <span style="color: #0000ff;">if</span> <span style="color: #008000;">&#40;</span>curl <span style="color: #000040;">!</span><span style="color: #000080;">=</span> <span style="color: #0000ff;">NULL</span><span style="color: #008000;">&#41;</span> <span style="color: #008000;">&#123;</span>
        <span style="color: #666666;">// Set up the http target</span>
        CString strFmt<span style="color: #008080;">;</span>
        strFmt.<span style="color: #007788;">Format</span> <span style="color: #008000;">&#40;</span>IDS_TWITTER_STATUS, szUserName<span style="color: #008000;">&#41;</span><span style="color: #008080;">;</span>
        curl_easy_setopt <span style="color: #008000;">&#40;</span>curl, CURLOPT_URL, strFmt.<span style="color: #007788;">GetString</span> <span style="color: #008000;">&#40;</span><span style="color: #008000;">&#41;</span><span style="color: #008000;">&#41;</span><span style="color: #008080;">;</span>
&nbsp;
        <span style="color: #666666;">// Save the result into memory for now.</span>
        CMemFile buffer<span style="color: #008080;">;</span>
        curl_easy_setopt <span style="color: #008000;">&#40;</span>curl, CURLOPT_WRITEFUNCTION, _CurlWriteCB<span style="color: #008000;">&#41;</span><span style="color: #008080;">;</span>
        curl_easy_setopt <span style="color: #008000;">&#40;</span>curl, CURLOPT_WRITEDATA,     <span style="color: #000040;">&amp;</span>buffer<span style="color: #008000;">&#41;</span><span style="color: #008080;">;</span>
&nbsp;
        <span style="color: #666666;">// Attempt to grab the data from Twitter.</span>
        CURLcode res <span style="color: #000080;">=</span> curl_easy_perform <span style="color: #008000;">&#40;</span>curl<span style="color: #008000;">&#41;</span><span style="color: #008080;">;</span>
        curl_easy_cleanup <span style="color: #008000;">&#40;</span>curl<span style="color: #008000;">&#41;</span><span style="color: #008080;">;</span>
&nbsp;
        <span style="color: #666666;">// Return the data.</span>
        <span style="color: #0000ff;">if</span> <span style="color: #008000;">&#40;</span>res <span style="color: #000080;">==</span> <span style="color: #0000dd;">0</span><span style="color: #008000;">&#41;</span> <span style="color: #008000;">&#123;</span>
            uiSize <span style="color: #000080;">=</span> <span style="color: #008000;">&#40;</span>UINT<span style="color: #008000;">&#41;</span>buffer.<span style="color: #007788;">GetLength</span> <span style="color: #008000;">&#40;</span><span style="color: #008000;">&#41;</span><span style="color: #008080;">;</span>
            <span style="color: #0000ff;">return</span> buffer.<span style="color: #007788;">Detach</span> <span style="color: #008000;">&#40;</span><span style="color: #008000;">&#41;</span><span style="color: #008080;">;</span>
        <span style="color: #008000;">&#125;</span>
    <span style="color: #008000;">&#125;</span>
&nbsp;
    <span style="color: #0000ff;">return</span> <span style="color: #0000ff;">NULL</span><span style="color: #008080;">;</span>
<span style="color: #008000;">&#125;</span></pre></div></div>

<p>The above listing will download a user&#8217;s tweets and store them in a growable buffer (see <a href="http://msdn.microsoft.com/en-us/library/tzdxd4x0.aspx" target="_blank">CMemFile</a>). But now we have to present this to Xerces in a way that it will understand. Thankfully we can supply an arbitrary <a href="http://xerces.apache.org/xerces-c/apiDocs-2/classInputSource.html" target="_blank">InputSource</a> to a <a href="http://xerces.apache.org/xerces-c/apiDocs-2/classXercesDOMParser.html" target="_blank">DOMParser</a>, including one that will <a href="http://xerces.apache.org/xerces-c/apiDocs-2/classMemBufInputSource.html" target="_blank">wrap a piece of memory</a>.</p>

<div class="wp_syntax"><div class="code"><pre class="cpp" style="font-family:monospace;"><span style="color: #0000ff;">bool</span> DoGetStatuses <span style="color: #008000;">&#40;</span><span style="color: #0000ff;">char</span><span style="color: #000040;">*</span> szUserName<span style="color: #008000;">&#41;</span>
<span style="color: #008000;">&#123;</span>
    <span style="color: #666666;">// Query Twitter</span>
    UINT  uiSize<span style="color: #008080;">;</span>
    BYTE<span style="color: #000040;">*</span> pbData <span style="color: #000080;">=</span> GetStatusesFromTwitter <span style="color: #008000;">&#40;</span>szUserName, uiSize<span style="color: #008000;">&#41;</span><span style="color: #008080;">;</span>
    <span style="color: #0000ff;">if</span> <span style="color: #008000;">&#40;</span>pbData <span style="color: #000080;">==</span> <span style="color: #0000ff;">NULL</span><span style="color: #008000;">&#41;</span>
        <span style="color: #0000ff;">return</span> <span style="color: #0000ff;">false</span><span style="color: #008080;">;</span>
&nbsp;
    <span style="color: #666666;">// Move the memory into an object Xerces understands.</span>
    MemBufInputSource<span style="color: #000040;">*</span> pDataSrc <span style="color: #000080;">=</span> <span style="color: #0000dd;">new</span> MemBufInputSource
        <span style="color: #008000;">&#40;</span>pbData, uiSize, L<span style="color: #FF0000;">&quot;TwitterXML&quot;</span>, <span style="color: #0000ff;">true</span><span style="color: #008000;">&#41;</span><span style="color: #008080;">;</span>
&nbsp;
    <span style="color: #666666;">// Parse the data</span>
    XercesDOMParser parser<span style="color: #008080;">;</span>
    parser.<span style="color: #007788;">setValidationScheme</span> <span style="color: #008000;">&#40;</span>XercesDOMParser<span style="color: #008080;">::</span><span style="color: #007788;">Val_Never</span><span style="color: #008000;">&#41;</span><span style="color: #008080;">;</span>
    parser.<span style="color: #007788;">setDoNamespaces</span> <span style="color: #008000;">&#40;</span><span style="color: #0000ff;">false</span><span style="color: #008000;">&#41;</span><span style="color: #008080;">;</span>
    parser.<span style="color: #007788;">setDoSchema</span> <span style="color: #008000;">&#40;</span><span style="color: #0000ff;">false</span><span style="color: #008000;">&#41;</span><span style="color: #008080;">;</span>
    parser.<span style="color: #007788;">setDoValidation</span> <span style="color: #008000;">&#40;</span><span style="color: #0000ff;">false</span><span style="color: #008000;">&#41;</span><span style="color: #008080;">;</span>
    parser.<span style="color: #007788;">parse</span> <span style="color: #008000;">&#40;</span><span style="color: #000040;">*</span><span style="color: #008000;">&#40;</span>InputSource<span style="color: #000040;">*</span><span style="color: #008000;">&#41;</span>pDataSrc<span style="color: #008000;">&#41;</span><span style="color: #008080;">;</span>
&nbsp;
    <span style="color: #666666;">// Get the root node</span>
    DOMDocument<span style="color: #000040;">*</span> pDoc <span style="color: #000080;">=</span> parser.<span style="color: #007788;">getDocument</span> <span style="color: #008000;">&#40;</span><span style="color: #008000;">&#41;</span><span style="color: #008080;">;</span>
&nbsp;
    <span style="color: #666666;">//...</span>
&nbsp;
    <span style="color: #666666;">// Free memory.</span>
    <span style="color: #0000dd;">delete</span> pDataSrc<span style="color: #008080;">;</span>
    <span style="color: #0000ff;">return</span> <span style="color: #0000ff;">true</span><span style="color: #008080;">;</span>
<span style="color: #008000;">&#125;</span></pre></div></div>

<p>I was especially lazy in that last listing, because I told Xerces to <em>adopt</em> my buffer (the final <span style="font-family: Courier New;">true</span> argument), which means it&#8217;ll free it for me when it&#8217;s finished with it. Curiously though, even a memory object needs a <em>system id</em>, which is the purpose <span style="font-family: Courier New;">L&quot;TwitterXML&quot;</span> serves. With a <a href="http://xerces.apache.org/xerces-c/apiDocs-2/classDOMDocument.html" target="_blank">DOMDocument</a> in memory it was trivial to add the statuses to a list box.</p>
<p><a href="http://www.kierdugan.com/wp/wp-content/uploads/2010/03/XercesTest.jpg"><img class="aligncenter size-medium wp-image-59" title="XercesTest" src="http://www.kierdugan.com/wp/wp-content/uploads/2010/03/XercesTest-300x185.jpg" alt="" width="300" height="185" /></a></p>
<p>I was quite surprised at how complex a task I&#8217;d achieved given the effort I&#8217;d put in; hats off to both Xerces and libcurl. Now that I&#8217;d managed to list my tweets, naturally the next step was to try to submit one! So I made a new dialog for the occasion:</p>
<p><a href="http://www.kierdugan.com/wp/wp-content/uploads/2010/03/XercesTestPost.jpg"><img class="aligncenter size-full wp-image-60" title="XercesTestPost" src="http://www.kierdugan.com/wp/wp-content/uploads/2010/03/XercesTestPost.jpg" alt="" width="210" height="226" /></a></p>
<p>Clicking OK causes some magic to happen:</p>

<div class="wp_syntax"><div class="code"><pre class="cpp" style="font-family:monospace;"><span style="color: #0000ff;">int</span> CPostStatusDlg<span style="color: #008080;">::</span><span style="color: #007788;">DoStatusUpdate</span> <span style="color: #008000;">&#40;</span><span style="color: #008000;">&#41;</span>
<span style="color: #008000;">&#123;</span>
    <span style="color: #0000ff;">static</span> <span style="color: #0000ff;">const</span> <span style="color: #0000ff;">char</span><span style="color: #000040;">*</span> cszUrl <span style="color: #000080;">=</span>
        <span style="color: #FF0000;">&quot;http://api.twitter.com/1/statuses/update.xml&quot;</span><span style="color: #008080;">;</span>
&nbsp;
    <span style="color: #666666;">// Attempt to initialise cURL.</span>
    CURL<span style="color: #000040;">*</span> curl <span style="color: #000080;">=</span> curl_easy_init <span style="color: #008000;">&#40;</span><span style="color: #008000;">&#41;</span><span style="color: #008080;">;</span>
    <span style="color: #0000ff;">if</span> <span style="color: #008000;">&#40;</span>curl <span style="color: #000080;">==</span> <span style="color: #0000ff;">NULL</span><span style="color: #008000;">&#41;</span>
        <span style="color: #0000ff;">return</span> <span style="color: #0000dd;">0</span><span style="color: #008080;">;</span>
&nbsp;
    <span style="color: #666666;">// Configure the authentication</span>
    CString strFmt<span style="color: #008080;">;</span>
    strFmt.<span style="color: #007788;">Format</span> <span style="color: #008000;">&#40;</span>_T<span style="color: #008000;">&#40;</span><span style="color: #FF0000;">&quot;%s:%s&quot;</span><span style="color: #008000;">&#41;</span>, m_strUserName, m_strPassword<span style="color: #008000;">&#41;</span><span style="color: #008080;">;</span>
    curl_easy_setopt <span style="color: #008000;">&#40;</span>curl, CURLOPT_HTTPAUTH, CURLAUTH_BASIC<span style="color: #008000;">&#41;</span><span style="color: #008080;">;</span>
    curl_easy_setopt <span style="color: #008000;">&#40;</span>curl, CURLOPT_USERPWD,  <span style="color: #008000;">&#40;</span><span style="color: #0000ff;">char</span><span style="color: #000040;">*</span><span style="color: #008000;">&#41;</span>strFmt.<span style="color: #007788;">GetString</span> <span style="color: #008000;">&#40;</span><span style="color: #008000;">&#41;</span><span style="color: #008000;">&#41;</span><span style="color: #008080;">;</span>
&nbsp;
    <span style="color: #666666;">// Format the entire POST body as a C string.</span>
    <span style="color: #0000ff;">char</span><span style="color: #000040;">*</span> szStatus <span style="color: #000080;">=</span> curl_easy_escape <span style="color: #008000;">&#40;</span>curl, m_strStatus.<span style="color: #007788;">GetString</span> <span style="color: #008000;">&#40;</span><span style="color: #008000;">&#41;</span>,
        m_strStatus.<span style="color: #007788;">GetLength</span> <span style="color: #008000;">&#40;</span><span style="color: #008000;">&#41;</span><span style="color: #008000;">&#41;</span><span style="color: #008080;">;</span>
    <span style="color: #0000ff;">char</span> szPostBody<span style="color: #008000;">&#91;</span><span style="color: #0000ff;">BUFSIZ</span><span style="color: #008000;">&#93;</span><span style="color: #008080;">;</span>
    <span style="color: #0000dd;">sprintf</span> <span style="color: #008000;">&#40;</span>szPostBody, <span style="color: #FF0000;">&quot;status=%s&quot;</span>, szStatus<span style="color: #008000;">&#41;</span><span style="color: #008080;">;</span>
    curl_free <span style="color: #008000;">&#40;</span>szStatus<span style="color: #008000;">&#41;</span><span style="color: #008080;">;</span>
&nbsp;
    <span style="color: #666666;">// Set up the HTTP connection and use the POST method</span>
    curl_easy_setopt <span style="color: #008000;">&#40;</span>curl, CURLOPT_POST,           <span style="color: #0000dd;">1L</span><span style="color: #008000;">&#41;</span><span style="color: #008080;">;</span>
    curl_easy_setopt <span style="color: #008000;">&#40;</span>curl, CURLOPT_POSTFIELDSIZE,  <span style="color: #008000;">&#40;</span><span style="color: #0000ff;">long</span><span style="color: #008000;">&#41;</span> <span style="color: #0000dd;">strlen</span> <span style="color: #008000;">&#40;</span>szPostBody<span style="color: #008000;">&#41;</span><span style="color: #008000;">&#41;</span><span style="color: #008080;">;</span>
    curl_easy_setopt <span style="color: #008000;">&#40;</span>curl, CURLOPT_POSTFIELDS,     szPostBody<span style="color: #008000;">&#41;</span><span style="color: #008080;">;</span>
&nbsp;
    <span style="color: #666666;">// Finally, set the callback function and the URL.</span>
    CMemFile buffer<span style="color: #008080;">;</span>
    curl_easy_setopt <span style="color: #008000;">&#40;</span>curl, CURLOPT_WRITEFUNCTION, _CurlWriteCB<span style="color: #008000;">&#41;</span><span style="color: #008080;">;</span>
    curl_easy_setopt <span style="color: #008000;">&#40;</span>curl, CURLOPT_WRITEDATA,     <span style="color: #000040;">&amp;</span>buffer<span style="color: #008000;">&#41;</span><span style="color: #008080;">;</span>
    curl_easy_setopt <span style="color: #008000;">&#40;</span>curl, CURLOPT_URL,           cszUrl<span style="color: #008000;">&#41;</span><span style="color: #008080;">;</span>
&nbsp;
    <span style="color: #666666;">// Now we can execute at last!</span>
    <span style="color: #0000ff;">long</span> nResponse <span style="color: #000080;">=</span> <span style="color: #0000dd;">0</span><span style="color: #008080;">;</span>  <span style="color: #666666;">// CURLINFO_RESPONSE_CODE expects a long*</span>
    CURLcode res <span style="color: #000080;">=</span> curl_easy_perform <span style="color: #008000;">&#40;</span>curl<span style="color: #008000;">&#41;</span><span style="color: #008080;">;</span>
    curl_easy_getinfo <span style="color: #008000;">&#40;</span>curl, CURLINFO_RESPONSE_CODE, <span style="color: #000040;">&amp;</span>nResponse<span style="color: #008000;">&#41;</span><span style="color: #008080;">;</span>
    curl_easy_cleanup <span style="color: #008000;">&#40;</span>curl<span style="color: #008000;">&#41;</span><span style="color: #008080;">;</span>
&nbsp;
    <span style="color: #666666;">// Return the HTTP response code so the caller can check for success</span>
    <span style="color: #0000ff;">return</span> nResponse<span style="color: #008080;">;</span>
<span style="color: #008000;">&#125;</span></pre></div></div>

<p>Boom! Tweet submitted!</p>
<p>In the above listing, the growable buffer and the callback are there largely to eat the output from libcurl, because we don&#8217;t really care about it. CMemFile will also free the memory it allocated when the function returns, which saves hassle. I originally wrote all the code listings with Unicode in mind, which is why they might appear a bit odd. libcurl is an ANSI C library that works with narrow <span style="font-family: Courier New;">char</span> strings, so in a Unicode build you may need to convert your strings for it to work. Thankfully Xerces includes <a href="http://xerces.apache.org/xerces-c/apiDocs-2/classXMLString.html" target="_blank">some basic support</a> for transcoding strings, because it uses Unicode internally.</p>
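<p>As a rough sketch of the string conversion mentioned above, the following standard C++ function turns a wide string into UTF-8 bytes that could then be handed to a narrow-string API such as <span style="font-family: Courier New;">curl_easy_escape</span>. The name <span style="font-family: Courier New;">WideToUtf8</span> is hypothetical and not part of MFC, libcurl or Xerces. The sketch also assumes that the receiving end wants UTF-8, and that <span style="font-family: Courier New;">wchar_t</span> holds UTF-16 code units (as on Windows) or UTF-32 code points.</p>

```cpp
#include <string>

// Hypothetical helper (not from MFC, libcurl or Xerces): encodes a wide
// string as UTF-8. Assumes wchar_t holds UTF-16 code units or UTF-32 code
// points; anything that is not a valid code point becomes U+FFFD.
std::string WideToUtf8 (const std::wstring& strWide)
{
    std::string strOut;
    for (std::wstring::size_type i = 0; i < strWide.size (); ++i)
    {
        unsigned long cp = static_cast<unsigned long> (strWide[i]);

        // Combine a UTF-16 surrogate pair into a single code point.
        if (cp >= 0xD800 && cp <= 0xDBFF && i + 1 < strWide.size ())
        {
            const unsigned long lo = static_cast<unsigned long> (strWide[i + 1]);
            if (lo >= 0xDC00 && lo <= 0xDFFF)
            {
                cp = 0x10000 + ((cp - 0xD800) << 10) + (lo - 0xDC00);
                ++i;
            }
        }

        // Unpaired surrogates and out-of-range values are replaced.
        if ((cp >= 0xD800 && cp <= 0xDFFF) || cp > 0x10FFFF)
            cp = 0xFFFD;

        if (cp < 0x80)
        {
            // One byte: plain ASCII.
            strOut += static_cast<char> (cp);
        }
        else if (cp < 0x800)
        {
            // Two bytes.
            strOut += static_cast<char> (0xC0 | (cp >> 6));
            strOut += static_cast<char> (0x80 | (cp & 0x3F));
        }
        else if (cp < 0x10000)
        {
            // Three bytes.
            strOut += static_cast<char> (0xE0 | (cp >> 12));
            strOut += static_cast<char> (0x80 | ((cp >> 6) & 0x3F));
            strOut += static_cast<char> (0x80 | (cp & 0x3F));
        }
        else
        {
            // Four bytes.
            strOut += static_cast<char> (0xF0 | (cp >> 18));
            strOut += static_cast<char> (0x80 | ((cp >> 12) & 0x3F));
            strOut += static_cast<char> (0x80 | ((cp >> 6) & 0x3F));
            strOut += static_cast<char> (0x80 | (cp & 0x3F));
        }
    }
    return strOut;
}
```

<p>In a Unicode build, something like <span style="font-family: Courier New;">WideToUtf8 (std::wstring (m_strStatus.GetString ()))</span> would produce the narrow string to escape. On Windows, the Win32 <span style="font-family: Courier New;">WideCharToMultiByte</span> function with <span style="font-family: Courier New;">CP_UTF8</span> does the same job without any hand-rolled code.</p>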
<p>Little victory indeed.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.kierdugan.com/2010/03/25/twitter-me-xerces/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>

