<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>holehouse.org</title>
	<atom:link href="http://www.holehouse.org/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.holehouse.org</link>
	<description></description>
	<lastBuildDate>Fri, 14 Jun 2013 16:41:31 +0000</lastBuildDate>
	<language>en-US</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.5.1</generator>
		<item>
		<title>Encryption as a flag</title>
		<link>http://www.holehouse.org/thoughts/encryption-as-a-flag/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=encryption-as-a-flag</link>
		<comments>http://www.holehouse.org/thoughts/encryption-as-a-flag/#comments</comments>
		<pubDate>Fri, 14 Jun 2013 16:41:31 +0000</pubDate>
		<dc:creator>alex</dc:creator>
				<category><![CDATA[Thoughts]]></category>
		<category><![CDATA[Big data]]></category>
		<category><![CDATA[data]]></category>
		<category><![CDATA[encryption]]></category>
		<category><![CDATA[NSA]]></category>
		<category><![CDATA[PRISM]]></category>

		<guid isPermaLink="false">http://www.holehouse.org/?p=629</guid>
		<description><![CDATA[There are various posts appearing online, suggesting that given the NSA may have direct (I refuse to treat an SFTP server, or basically anything where people aren&#8217;t involved, as not direct) access to email content, you should encrypt that content and take your privacy back. I&#8217;m not sure this is the best approach. The volume [...]]]></description>
				<content:encoded><![CDATA[<p>There are various posts appearing online, suggesting that given the NSA may have direct (I refuse to treat an SFTP server, or basically anything where people aren&#8217;t involved, as not direct) access to email content, you should encrypt that content and take your privacy back.</p>
<p>I&#8217;m not sure this is the best approach. The volume of data the NSA may or may not have access to every day, if they are able to access email, phone records etc., would be astronomical. We (the technological and scientific community) often find it hard to extract meaning from massive, well-structured datasets. How is the NSA expected to systematically derive relevant information from data as fragmented, diverse and unstructured as personal communications? If I worked for the NSA, I&#8217;d select a number of robust digital <em>targets</em> or <em>flags</em> to look for inside this information. One of those targets would almost certainly be encrypted emails. You can use fairly simple machine learning techniques (Bayesian filters come to mind) to pick key-encrypted text out from normal human-readable text.</p>
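<p>As a rough illustration of how cheap such a flag could be, here is a minimal Python sketch. It does not implement a Bayesian filter; it swaps in a simpler heuristic, character-level Shannon entropy, on the assumption that ASCII-armoured ciphertext looks like near-uniform base64 while prose does not. The function name and the sample strings are invented for illustration.</p>
<pre><code class="language-python python">import base64
import math
from collections import Counter


def shannon_entropy(text):
    '''Return the average information content of text, in bits per character.'''
    counts = Counter(text)
    total = float(len(text))
    # Sum -p * log2(p) over the observed character frequencies
    return -sum((n / total) * math.log(n / total, 2) for n in counts.values())


# Ordinary prose reuses a small alphabet unevenly, so its entropy is low
plain = 'the quick brown fox jumps over the lazy dog'

# Stand-in for an ASCII-armoured ciphertext: base64 of arbitrary bytes
armoured = base64.b64encode(bytes(bytearray(range(256)))).decode('ascii')

print(shannon_entropy(plain))
print(shannon_entropy(armoured))</code></pre>
<p>The second number should come out noticeably higher than the first. A real filter would need far more than this, since compressed or base64-encoded attachments look just as random.</p>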
<p>If a nefarious individual were planning something criminal and using email at all, I would expect them to encrypt or disguise the content in some way. Such an individual would have to assume all communications were compromised &#8211; just because we, the public, have not thought this until now does not mean that organizations participating in illegal activities have not long made this assumption. It&#8217;s probably also fair to say that, until last week, very few people encrypted the contents of their emails.</p>
<p>Considering this, if I worked for the NSA, I would probably think, &#8220;Well &#8211; this is one easy way to drastically reduce the set of data we have to examine&#8221;, and essentially treat encrypted emails as a flag. If programs like PRISM are as far-reaching as they&#8217;re claimed to be, then email is one dimension of a network of data which can be built around a potential target. Encrypting your email may essentially attract the very attention it&#8217;s trying to avoid.</p>
<p>One solution to this is to use old-school codes and cyphers. Like Enid Blyton&#8217;s youthful heroes and heroines, or every spy kit given to every budding childhood spy, secret messages encoded within apparently benign or routine notes may be a far less conspicuous way to sneak secrets through the tubes. Of course, encrypted or coded messages are largely uninformative without other pointers. I suspect this is the precise reason why PRISM is reported to be so massively encroaching &#8211; simply put, using email or cellular metadata alone does not provide anything close to the granularity or certainty needed to make informed analyses of potential terror threats. That being said, how the NSA would deal with such a vast, massively complex, semi-overlapping, semi-complementary data set is, theoretically at least, an incredibly interesting problem.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.holehouse.org/thoughts/encryption-as-a-flag/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>klean &#8211; easy sequence copy-paste</title>
		<link>http://www.holehouse.org/programming/klean-easy-sequence-copy-paste/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=klean-easy-sequence-copy-paste</link>
		<comments>http://www.holehouse.org/programming/klean-easy-sequence-copy-paste/#comments</comments>
		<pubDate>Thu, 17 Jan 2013 15:28:33 +0000</pubDate>
		<dc:creator>alex</dc:creator>
				<category><![CDATA[Programming]]></category>
		<category><![CDATA[Hacks]]></category>
		<category><![CDATA[Terminal]]></category>

		<guid isPermaLink="false">http://www.holehouse.org/?p=615</guid>
		<description><![CDATA[I&#8217;m forever copying protein or DNA sequences from online resources to other places. The problem is, online resources display these sequences as split line format with spaces, which is great for viewing, but not so good if you want to manipulate it as a string. Say I wanted to do something with the p53 protein [...]]]></description>
				<content:encoded><![CDATA[<p>I&#8217;m forever copying protein or DNA sequences from online resources to other places. The problem is, online resources display these sequences in a split-line format with spaces, which is great for viewing, but not so good if you want to manipulate a sequence as a string.</p>
<p>Say I wanted to do something with the p53 protein sequence. My general approach would be to;</p>
<ul>
<li>Go to UniProt (<a href="http://www.uniprot.org">http://www.uniprot.org</a>)</li>
<li>Search for p53</li>
<li>Open the relevant record (p53 in Humans)</li>
<li>Open the sequence in .fasta (<a href="http://www.uniprot.org/uniprot/P04637.fasta">http://www.uniprot.org/uniprot/P04637.fasta</a>)</li>
</ul>
<p>Now I have a problem &#8211; I don&#8217;t want all these line breaks in my sequence &#8211; I just want it to be a single string. What I used to do was manually delete the line breaks, but this is a drag. It&#8217;s not hard, takes maybe 20-30 seconds, but feels very lo-fi.</p>
<p>To get around this I wrote <strong><a href="https://github.com/alexholehouse/biotools">klean</a></strong> (one of my biotools scripts), a command line utility which takes a string as an argument and copies it, stripped of any spaces, tabs, new lines etc., to your middle-button clipboard for easy manipulation (the CTRL-V clipboard is left untouched).</p>
<p>For example<br />
<code>klean "MEEPQSDPSVEPPLSQETFSDLWKLLPENNVLSPLPSQAMDDLMLSPDDIEQWFTEDPGP<br />
DEAPRMPEAAPPVAPAPAAPTPAAPAPAPSWPLSSSVPSQKTYQGSYGFRLGFLHSGTAK<br />
SVTCTYSPALNKMFCQLAKTCPVQLWVDSTPPPGTRVRAMAIYKQSQHMTEVVRRCPHHE<br />
RCSDSDGLAPPQHLIRVEGNLRVEYLDDRNTFRHSVVVPYEPPEVGSDCTTIHYNYMCNS<br />
SCMGGMNRRPILTIITLEDSSGNLLGRNSFEVRVCACPGRDRRTEEENLRKKGEPHHELP<br />
PGSTKRALPNNTSSSPQPKKKPLDGEYFTLQIRGRERFEMFRELNEALELKDAQAGKEPG<br />
GSRAHSSHLKSKKGQSTSRHKKLMFKTEGPDSD"</code></p>
<p>Will copy this whole badboy to my clipboard as a single string, so when I press the middle mouse button it pastes it wherever I want.</p>
<p>As with any executable script, to make it system wide either put it in your <code>/usr/bin</code> directory, or any other directory in your PATH (see <code>echo $PATH</code> in bash/zsh for these locations). Simple and totally unnecessary, but it has significantly improved the last few days!</p>
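<p>For anyone curious about the idea rather than the script itself, the two steps can be sketched in a few lines of Python. This is not the klean source; the function names are made up for illustration, and the clipboard step assumes the <code>xclip</code> utility is installed on an X11 system.</p>
<pre><code class="language-python python">import subprocess


def clean_sequence(raw):
    '''Remove every space, tab and line break from a pasted sequence.'''
    # str.split() with no arguments splits on any run of whitespace
    return ''.join(raw.split())


def copy_to_primary(text):
    '''Send text to the X11 primary (middle-button) selection.

    Assumes the xclip utility is installed; the real klean may use
    a different mechanism.
    '''
    proc = subprocess.Popen(['xclip', '-selection', 'primary'],
                            stdin=subprocess.PIPE)
    proc.communicate(text.encode('ascii'))</code></pre>
<p>Calling <code>copy_to_primary(clean_sequence(...))</code> on the first command line argument would then be more or less the whole script.</p>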
]]></content:encoded>
			<wfw:commentRss>http://www.holehouse.org/programming/klean-easy-sequence-copy-paste/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Casting to zero</title>
		<link>http://www.holehouse.org/thoughts/casting-to-zero/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=casting-to-zero</link>
		<comments>http://www.holehouse.org/thoughts/casting-to-zero/#comments</comments>
		<pubDate>Sun, 13 Jan 2013 07:02:26 +0000</pubDate>
		<dc:creator>alex</dc:creator>
				<category><![CDATA[Thoughts]]></category>

		<guid isPermaLink="false">http://www.holehouse.org/?p=611</guid>
		<description><![CDATA[Aaron Swartz&#8217;s recent suicide has reminded me of many things. From a purely selfish point of view, I can say with absolute certainty I&#8217;ll never have the privilege of interacting with a guy who, by all accounts, operated on an intellectual and functional level far, far above normal people. Somewhat more important, though, is the reminder that depression is [...]]]></description>
				<content:encoded><![CDATA[<p>Aaron Swartz&#8217;s <strong><a href="http://www.washingtonpost.com/blogs/wonkblog/wp/2013/01/12/aaron-swartz-american-hero/">recent suicide</a></strong> has reminded me of many things.</p>
<p>From a purely selfish point of view, I can say with absolute certainty I&#8217;ll never have the privilege of interacting with a guy who, by all accounts, operated on an intellectual and functional level far, far above normal people. Somewhat more important, though, is the reminder that depression is not something people can control. It&#8217;s not something you can rationalize, or defeat using a logical approach. If it could have been &#8220;solved&#8221;, as one might solve a technical problem, I have no doubt that Mr. Swartz would have solved it.</p>
<p>&nbsp;</p>
]]></content:encoded>
			<wfw:commentRss>http://www.holehouse.org/thoughts/casting-to-zero/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Scientific computing best practice</title>
		<link>http://www.holehouse.org/programming/programming-for-scientists/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=programming-for-scientists</link>
		<comments>http://www.holehouse.org/programming/programming-for-scientists/#comments</comments>
		<pubDate>Wed, 28 Nov 2012 14:25:55 +0000</pubDate>
		<dc:creator>alex</dc:creator>
				<category><![CDATA[Programming]]></category>

		<guid isPermaLink="false">http://www.holehouse.org/?p=599</guid>
		<description><![CDATA[As someone who uses computational approaches to solve biologically relevant problems, one of my primary motivations for doing a masters in (pure) computer science was to ensure that the software I developed in the future would actually be good. Clearly, &#8220;good&#8221; is somewhat subjective, but by good I mean more than functional. I mean; Understandable [...]]]></description>
				<content:encoded><![CDATA[<p>As someone who uses computational approaches to solve biologically relevant problems, one of my primary motivations for doing a masters in (pure) computer science was to ensure that the software I developed in the future would actually be good. Clearly, &#8220;good&#8221; is somewhat subjective, but by good I mean more than functional. I mean;</p>
<ul>
<li>Understandable (and readable)</li>
<li>Maintainable</li>
<li>Testable</li>
<li>Extendable</li>
</ul>
<p>Which are four things I think software written by non-computer scientists often lacks, to various degrees. <em>Clearly</em> this is not always the case, and I am not for a second saying that non-computer scientists are unable to write good code, but if you&#8217;re thrust into an environment where your only goal is to solve a problem, these best practices often slip into the background compared to issues like, &#8220;<em>what is an environmental variable?</em>&#8221;, &#8220;<em>why doesn&#8217;t this compile?</em>&#8221; or &#8220;<em>how the hell do I get out of VIM?</em>&#8221;[1].</p>
<p>More generally, if no one teaches you best practice, how are you meant to learn? Often people will develop a sense over time, but for that first year or so you may not even know you&#8217;re missing crucial ideas or approaches. &#8220;<em>Version control &#8211; sure, I save incremental file names like &#8216;mycode_1.pl&#8217;, &#8216;mycode_2.pl&#8217;</em>&#8221;.</p>
<p><strong><a href="http://arxiv.org/pdf/1210.0530v2.pdf">A recent article</a> </strong>aims to address these issues by summarizing scientific computing best practices. It&#8217;s short, well written and to the point, but encompasses a huge number of ideas crucial for software development. <em>HIGHLY </em>recommended reading for anyone involved in writing even the most mundane scripts. Read it for your collaborators. Read it for future students and postdocs. But most of all, read it for future you, because there is no greater pain than not understanding something you <em>know </em>you once understood, because you wrote it.</p>
<p>[1] &#8211; Full disclosure, I still get trapped in VIM from time to time.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.holehouse.org/programming/programming-for-scientists/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Geeneus 0.1.2 released</title>
		<link>http://www.holehouse.org/projects/coding/geeneus-0-1-2-out/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=geeneus-0-1-2-out</link>
		<comments>http://www.holehouse.org/projects/coding/geeneus-0-1-2-out/#comments</comments>
		<pubDate>Mon, 29 Oct 2012 07:55:50 +0000</pubDate>
		<dc:creator>alex</dc:creator>
				<category><![CDATA[Coding]]></category>
		<category><![CDATA[API]]></category>
		<category><![CDATA[NCBI]]></category>
		<category><![CDATA[PhD]]></category>

		<guid isPermaLink="false">http://www.holehouse.org/?p=590</guid>
		<description><![CDATA[The current stable alpha (there&#8217;s an oxymoron) version of Geeneus is now available through the Python package index. For more information have a look at the entry or at the associated github page. This is the culmination of a lot of work, but I&#8217;m happy with the result, especially how I&#8217;ve managed to deal with the Networking.py [...]]]></description>
				<content:encoded><![CDATA[<p>The current stable alpha (there&#8217;s an oxymoron) version of Geeneus is now available through the Python package index. For more information <strong><a href="http://pypi.python.org/pypi/Geeneus/0.1.2" target="_blank">have a look at the entry</a></strong> or at the associated <a href="http://rednaxela.github.com/Geeneus/" target="_blank"><strong>github</strong> <strong>page</strong></a>. This is the culmination of a lot of work, but I&#8217;m happy with the result, especially how I&#8217;ve managed to deal with the Networking.py class, which has been reduced to just 230 LOC, about half of which are actually comments or spacers. This one class handles literally 100% of the network calls and error handling, which makes maintenance a lot easier. Functional programming for the win!</p>
<p>I&#8217;m currently using it to build what will end up being a 100 000 row table, using about 2000 batch queries, and so far it&#8217;s going well [ <strong>UPDATE</strong> - Table built successfully over a weekend]. I&#8217;m <em>really</em> pleased with how it handles bad accession values in batch mode - it responds quickly and we find error-causing entries very fast. Also, being able to install globally with a single command is <em>awesome</em>. Thank you <code>pip</code>!</p>
]]></content:encoded>
			<wfw:commentRss>http://www.holehouse.org/projects/coding/geeneus-0-1-2-out/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>SQLAlchemy and MySQL &#8211; unsigned foreign keys</title>
		<link>http://www.holehouse.org/uncategorized/sqlalchemy-and-mysql-unsigned-foreign-keys/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=sqlalchemy-and-mysql-unsigned-foreign-keys</link>
		<comments>http://www.holehouse.org/uncategorized/sqlalchemy-and-mysql-unsigned-foreign-keys/#comments</comments>
		<pubDate>Wed, 26 Sep 2012 23:17:24 +0000</pubDate>
		<dc:creator>alex</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://www.holehouse.org/?p=563</guid>
		<description><![CDATA[I&#8217;m currently working on a back-end database and analysis platform. Our data is stored in an SQL database, but the analysis code is in Python, so we use SQLAlchemy to talk to the database using its ORM layer. First off, SQLAlchemy is pretty much alchemy &#8211; there&#8217;s a bit of a learning curve, but it&#8217;s very [...]]]></description>
				<content:encoded><![CDATA[<p>I&#8217;m currently working on a back-end database and analysis platform. Our data is stored in an SQL database, but the analysis code is in Python, so we use SQLAlchemy to talk to the database using its ORM layer. First off, SQLAlchemy <em>is</em> pretty much alchemy &#8211; there&#8217;s a bit of a learning curve, but it&#8217;s very easy to use, and totally abstracts the data-warehousing away from your code. We&#8217;re back-porting SQLAlchemy functionality onto an existing and running database, and in doing so I&#8217;ve found a bit of a quirk regarding how SQLAlchemy interacts with a MySQL database.</p>
<p>I want to use the ORM to create a new table in the database which has a foreign key referencing one of the existing tables &#8211; not an unreasonable request. The database is already set up, and a <code>desc</code> of the hopeful parent table, run from the MySQL monitor, gives the following output;</p>
<p>&nbsp;</p>
<script src="https://gist.github.com/bca212e022a8e7b5235b.js"></script><noscript><pre><code class="language-sql sql">+------------+------------------+------+-----+---------+----------------+
| Field      | Type             | Null | Key | Default | Extra          |
+------------+------------------+------+-----+---------+----------------+
| id         | int(10) unsigned | NO   | PRI | NULL    | auto_increment |
| pqid       | text             | NO   |     | NULL    |                |
| texid      | int(10) unsigned | NO   |     | NULL    |                |
| toll       | varchar(30)      | YES  |     | NULL    |                |
| name       | varchar(100)     | NO   |     |         |                |
| dateval    | varchar(7)       | NO   |     |         |                |
+------------+------------------+------+-----+---------+----------------+
</code></pre></noscript>
<p>&nbsp;</p>
<p>So, looking at this output we can define the following class for the ORM;</p>
<p>&nbsp;</p>
<script src="https://gist.github.com/2b979c3021f937d89b68.js"></script><noscript><pre><code class="language-python python">class EGTable(Base):
    __tablename__='egtable'
    id = Column(Integer(10), primary_key=True, autoincrement=True)
    pqid = Column(TEXT)
    texid = Column(Integer(10))
    toll = Column(VARCHAR(30))
    name = Column(VARCHAR(100))
    dateval = Column(VARCHAR(7))
</code></pre></noscript>
<p>&nbsp;</p>
<p>Using this class I can connect to the database and query through using the ORM without a hassle. For example, below I connect to the database, and then I query it for the first item in the table <code>egtable</code>;</p>
<p>&nbsp;</p>
<script src="https://gist.github.com/fc77bdc77092c8550aa6.js"></script><noscript><pre><code class="language-python python">from sqlalchemy import *
from sqlalchemy.orm import sessionmaker

import orm_classfile # file which contains our class definition above

engine = create_engine('mysql://&lt;user&gt;:&lt;password&gt;@localhost/&lt;database&gt;?charset=utf8&amp;use_unicode=0', pool_recycle=3600)
Session = sessionmaker(bind=engine)
session = Session()

# print out the pqid field from the first result returned
# from the EGTable
print session.query(orm_classfile.EGTable).first().pqid</code></pre></noscript>
<p>&nbsp;</p>
<p>However, if we try and <strong>create</strong> a table using this same syntax/style, we hit a problem. Look at this class &#8211; we should be able to create a table from this, right?</p>
<p>&nbsp;</p>
<script src="https://gist.github.com/584f2d232b033cc0ef64.js"></script><noscript><pre><code class="language-python python">class OtherTable(Base):
    __tablename__='linktable'
    id = Column(Integer(10), primary_key=True, autoincrement=True)
    test = Column(Integer(5))
    protein_id = Column(Integer(10), ForeignKey('egtable.id'))
    protein = relationship(&quot;EGTable&quot;)</code></pre></noscript>
<p>&nbsp;</p>
<p>We only have three columns, and we define the foreign key in exactly the same way as above (i.e. an Integer of width 10). Given we can read the database using this ORM, surely that means it&#8217;s a valid way of describing the data? Well no &#8211; this fails. It fails with MySQL error 1005 (errno: 150), which indicates an incorrectly formed foreign key constraint.</p>
<p>&nbsp;</p>
<script src="https://gist.github.com/7a6a2d2f3a0516db6f69.js"></script><noscript><pre><code class="language-python traceback python traceback">Traceback (most recent call last):
  File &quot;&lt;stdin&gt;&quot;, line 1, in &lt;module&gt;
  File &quot;build/bdist.linux-i686/egg/sqlalchemy/schema.py&quot;, line 2564, in create_all
  File &quot;build/bdist.linux-i686/egg/sqlalchemy/engine/base.py&quot;, line 2303, in _run_visitor
  ...
  ...
  File &quot;build/bdist.linux-i686/egg/sqlalchemy/engine/default.py&quot;, line 331, in do_execute
  File &quot;/usr/lib/python2.7/dist-packages/MySQLdb/cursors.py&quot;, line 174, in execute
    self.errorhandler(self, exc, value)
  File &quot;/usr/lib/python2.7/dist-packages/MySQLdb/connections.py&quot;, line 36, in defaulterrorhandler
    raise errorclass, errorvalue
sqlalchemy.exc.OperationalError: (OperationalError) (1005, &quot;Can't create table 'database_name.linktable' (errno: 150)&quot;)</code></pre></noscript>
<p>&nbsp;</p>
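<p>Why? Because in MySQL a foreign key column must have the same signedness as the column it references, and standard SQL has no unsigned integer type, so SQLAlchemy&#8217;s generic <code>Integer</code> produces a signed column which cannot reference the <code>unsigned</code> <code>id</code> of the parent table. To get around this, you can import the MySQL dialect&#8217;s integer type via</p>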
<p>&nbsp;</p>
<p><code>from sqlalchemy.dialects.mysql import INTEGER as Integer</code></p>
<div></div>
<div></div>
<div>and then define the foreign key as unsigned</div>
<div></div>
<div><script src="https://gist.github.com/b47a3fdf0e0fabad8cea.js"></script><noscript><pre><code class="language-python python">class OtherTable(Base):
    __tablename__='linktable'
    id = Column(Integer(10), primary_key=True, autoincrement=True)
    test = Column(Integer(5))
    protein_id = Column(Integer(display_width=10, unsigned=True), ForeignKey('egtable.id'))
    protein = relationship(&quot;EGTable&quot;)</code></pre></noscript></div>
<div></div>
<div>and hey presto, we can now create this table (using <code>Base.metadata.create_all(engine)</code>), once we&#8217;ve defined our declarative base and connected the engine to the database.</div>
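<p>If you want to check what SQLAlchemy will send before pointing it at a live database, something along these lines should work. It is a minimal sketch rather than the code from this project: the two tables are cut-down stand-ins, and it uses the Core <code>Table</code> and <code>CreateTable</code> constructs instead of the declarative classes above.</p>
<pre><code class="language-python python">from sqlalchemy import Column, ForeignKey, MetaData, Table
from sqlalchemy.dialects import mysql
from sqlalchemy.schema import CreateTable

metadata = MetaData()

# Cut-down stand-in for the existing parent table (only the key column)
egtable = Table(
    'egtable', metadata,
    Column('id', mysql.INTEGER(unsigned=True), primary_key=True),
)

# Child table whose foreign key matches the parent's unsigned type
linktable = Table(
    'linktable', metadata,
    Column('id', mysql.INTEGER(unsigned=True), primary_key=True),
    Column('protein_id', mysql.INTEGER(unsigned=True),
           ForeignKey('egtable.id')),
)

# Render the CREATE TABLE statement for MySQL without touching a database
ddl = str(CreateTable(linktable).compile(dialect=mysql.dialect()))
print(ddl)</code></pre>
<p>The <code>protein_id</code> line of the printed statement should carry <code>UNSIGNED</code>, which is the part the generic <code>Integer</code> type leaves out.</p>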
]]></content:encoded>
			<wfw:commentRss>http://www.holehouse.org/uncategorized/sqlalchemy-and-mysql-unsigned-foreign-keys/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Lenovo U410 &#8211; no lock for you</title>
		<link>http://www.holehouse.org/thoughts/lenovo-u410-no-lock-for-you/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=lenovo-u410-no-lock-for-you</link>
		<comments>http://www.holehouse.org/thoughts/lenovo-u410-no-lock-for-you/#comments</comments>
		<pubDate>Tue, 25 Sep 2012 04:41:36 +0000</pubDate>
		<dc:creator>alex</dc:creator>
				<category><![CDATA[Thoughts]]></category>
		<category><![CDATA[Business]]></category>
		<category><![CDATA[Design]]></category>

		<guid isPermaLink="false">http://www.holehouse.org/?p=553</guid>
		<description><![CDATA[Before I bought my new Lenovo U410 IdeaPad (which, for the record, is disgustingly awesome) I quickly checked to see if it had a Kensington lock mount-point before I pressed, &#8220;buy&#8221; (ironically, after pressing, &#8220;buy&#8221; it was another four weeks before I actually got the laptop, but that&#8217;s another story&#8230;). I found a port-schematic on the Lenovo website, and glancing through [...]]]></description>
				<content:encoded><![CDATA[<p style="text-align: left;">Before I bought my new <a href="http://www.lenovo.com/products/us/laptop/ideapad/u-series/u410/">Lenovo U410 IdeaPad</a> (which, for the record, is disgustingly awesome) I quickly checked to see if it had a Kensington lock mount-point before I pressed, &#8220;buy&#8221; (ironically, after pressing, &#8220;buy&#8221; it was another four weeks before I actually got the laptop, but that&#8217;s another story&#8230;).</p>
<p style="text-align: left;">I found a port-schematic on the Lenovo website, and glancing through found the &#8220;Kensington Lock&#8221; item on the key. What I failed to notice, however, was that despite being on the key for the laptop the corresponding number &#8220;8&#8221; doesn&#8217;t actually appear on the laptop image. Instead, there are two &#8220;7&#8221;s.</p>
<p>&nbsp;</p>
<p style="text-align: center;"><a href="http://www.holehouse.org/wp-content/uploads/2012/09/lenovolol1.jpeg"><img class=" wp-image-555 aligncenter" title="Pwnd by Lenovo" src="http://www.holehouse.org/wp-content/uploads/2012/09/lenovolol1.jpeg" alt="" width="563" height="283" /></a></p>
<p>Well played Lenovo, you win this round&#8230;</p>
]]></content:encoded>
			<wfw:commentRss>http://www.holehouse.org/thoughts/lenovo-u410-no-lock-for-you/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>Julia &#8211; an exciting time for statistical programming</title>
		<link>http://www.holehouse.org/programming/julia-an-exciting-time-for-statistical-programming/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=julia-an-exciting-time-for-statistical-programming</link>
		<comments>http://www.holehouse.org/programming/julia-an-exciting-time-for-statistical-programming/#comments</comments>
		<pubDate>Sun, 06 May 2012 21:41:28 +0000</pubDate>
		<dc:creator>alex</dc:creator>
				<category><![CDATA[Programming]]></category>
		<category><![CDATA[Julia]]></category>
		<category><![CDATA[R]]></category>
		<category><![CDATA[Statistical programming]]></category>

		<guid isPermaLink="false">http://www.holehouse.org/?p=480</guid>
		<description><![CDATA[On Tuesday last week I attended my first ever meetup in New York as a member of the Open Statistical Programming Meetup. This particular meetup was focusing on Julia, a new programming language I’d been interested in hearing more about since I read a post by John Myles White on the R-bloggers metablog titled, “Julia, I love [...]]]></description>
				<content:encoded><![CDATA[<p>On Tuesday last week I attended my first ever meetup in New York as a member of the <strong><a href="http://www.meetup.com/nyhackr/">Open Statistical Programming Meetup</a></strong>. This particular meetup was focusing on <strong><a href="http://julialang.org/">Julia</a></strong>, a new programming language I’d been interested in hearing more about since I read a post by John Myles White on the R-bloggers metablog titled, “<strong><a href="http://www.r-bloggers.com/julia-i-love-you/">Julia, I love you</a></strong>”. Since then I’d read bits and pieces about the language, but generally more conceptual stuff, rather than syntax or applications. Given that <strong><a href="http://karpinski.org/">Stefan Karpinski</a></strong>, one of the language’s creators was speaking, this seemed an ideal opportunity to get more information from the horse’s mouth, as well as see some demos of active projects (and see why, exactly, John loves Julia so much).</p>
<div id="attachment_496" class="wp-caption alignleft" style="width: 488px"><a href="http://www.holehouse.org/wp-content/uploads/2012/05/Julia.jpg"><img class=" wp-image-496 " title="Picture from the nyhackr twitter feed" src="http://www.holehouse.org/wp-content/uploads/2012/05/Julia.jpg" alt="" width="478" height="358" /></a><p class="wp-caption-text">Picture from the meetup &#8211; 10 points if you can spot me!</p></div>
<p>In a nutshell, Julia is a dynamic language which uses an LLVM-based aggressive JIT compiler for incredibly fast and flexible scientific computing. It allows high and low level programming in a single language, with the ability to easily call C, Python and Fortran libraries. While it doesn&#8217;t yet have a threading model, it allows parallel computing over a distributed system, meaning you can transparently send and receive information from distributed computers, making it ideal for big data, the buzzword of the moment. The initial, pre-release version was pushed out in February this year, and since then there has been a huge amount of work updating and building out the necessary support. Its development has been based on the following aims;</p>
<p style="padding-left: 30px;"><em>&#8220;We want a language that’s open source, with a liberal license. We want the speed of C with the dynamism of Ruby. We want a language that’s homoiconic, with true macros like Lisp, but with obvious, familiar mathematical notation like Matlab. We want something as usable for general programming as Python, as easy for statistics as R, as natural for string processing as Perl, as powerful for linear algebra as Matlab, as good at gluing programs together as the shell. Something that is dirt simple to learn, yet keeps the most serious hackers happy.&#8221;</em></p>
<p>Those are pretty lofty goals, but after seeing what Julia can do, I’d say Stefan and co are well on their way to meeting them. There were a few things during the talks which I felt really stood out.</p>
<ul>
<li><strong>Optional typing.</strong> This means you can define a variable as a specific type (in which case you impose the various [runtime] typing restrictions you expect) but you don’t have to. This option is awesome. It means you can write typesafe code where appropriate, or hack something together in a matter of minutes (as you might in Python or Perl). The best thing, though, is that using or not using typing has no impact on performance, meaning it’s purely a design choice, not an optimization one. It also allows for a totally natural way to make calls to C functions, which is typically less syntactically clear in a totally untyped language. What typing does allow, however, is multiple dispatch, which allows for massive generalization of functions and code. Because type checking occurs only at runtime, with no compile-time checking, a type can depend on real values and not just on other types, which allows for a dependent type system that would otherwise be impossible. This adds a lot of flexibility into how you develop your systems.</li>
</ul>
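<p>To make the multiple dispatch idea a little more concrete, here is a rough sketch written in Python rather than Julia. Python has no built-in multiple dispatch, so the small registry class below is hand-rolled and every name in it is hypothetical. The only point is that the implementation is chosen from the types of <em>all</em> the arguments, not just the first.</p>
<pre><code class="language-python python">class MultiMethod(object):
    '''A function whose implementation is selected by all argument types.'''

    def __init__(self):
        self.implementations = {}

    def register(self, *types):
        # Decorator that files an implementation under a tuple of types
        def decorator(func):
            self.implementations[types] = func
            return func
        return decorator

    def __call__(self, *args):
        # Look up the implementation matching the exact runtime types
        key = tuple(type(arg) for arg in args)
        return self.implementations[key](*args)


combine = MultiMethod()


@combine.register(int, int)
def combine_ints(a, b):
    return a + b


@combine.register(str, int)
def combine_text_and_count(text, count):
    return text * count


print(combine(2, 3))
print(combine('ab', 2))</code></pre>
<p>This toy version matches exact types only and knows nothing about subtypes, which a real implementation would have to handle.</p>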
<ul>
<li><strong>Speed</strong>. A lot of Julia&#8217;s speed comes from a type inference algorithm which allows the specialization of code. This means that rather than generating generic methods which then type check, type-specific code is compiled, making the language incredibly fast. I guess it’s kind of obvious, but both the benchmarks Stefan showed and the feedback from the developers who presented were pretty impressive. <strong><a href="http://wesmckinney.com/blog/?p=475">Wes McKinney blogged </a></strong>about some pretty trivial array operations benchmarking more slowly than you’d expect given the benchmarks shown. Stefan’s response to this was good, and I’d hope that any of the hiccups or inconsistencies seen this early on are simply due to a mixture of immature libraries and people not doing things in the most efficient way. Fingers crossed compiler optimization can try and correct for the latter, and given the language’s age this is pretty understandable.</li>
</ul>
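<p>To see the specialization described above for yourself, something like the following sketch should work. It assumes a current Julia 1.x install, where the <code>InteractiveUtils</code> standard library provides <code>@code_llvm</code>; the function name <code>twice</code> is made up for the example, and the exact output varies by Julia version and platform.</p>
<pre><code class="language-julia julia">using InteractiveUtils  # standard library; provides @code_llvm outside the REPL

# One generic, untyped definition (hypothetical example function).
twice(x) = x + x

# Each call prints the code compiled for that specific argument type.
@code_llvm twice(1)      # specialized for Int: an integer add
@code_llvm twice(1.0)    # specialized for Float64: a floating-point add</code></pre>
<p>The same one-line definition yields two different pieces of compiled code, one per argument type, with no type checks left inside either.</p>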
<table class="benchmarks">
<colgroup>
<col class="name" />
<col class="relative" span="6" /> </colgroup>
<caption>Benchmark times relative to C++ &#8211; taken directly from <strong><a href="http://julialang.org/">julialang.org</a></strong></caption>
<thead>
<tr>
<td></td>
<th class="system" style="text-align: center;">Julia</th>
<th class="system" style="text-align: center;">Python</th>
<th class="system" style="text-align: center;">Matlab</th>
<th class="system" style="text-align: center;">Octave</th>
<th class="system" style="text-align: center;">R</th>
<th class="system" style="text-align: center;">JavaScript</th>
</tr>
<tr>
<td></td>
<td class="version" style="text-align: center;">3f670da0</td>
<td class="version" style="text-align: center;">2.7.1</td>
<td class="version" style="text-align: center;">R2011a</td>
<td class="version" style="text-align: center;">3.4</td>
<td class="version" style="text-align: center;">2.14.2</td>
<td class="version" style="text-align: center;">V8 3.6.6.11</td>
</tr>
</thead>
<tbody>
<tr>
<th>fib</th>
<td class="data" style="text-align: center;">1.97</td>
<td class="data" style="text-align: center;">31.47</td>
<td class="data" style="text-align: center;">1336.37</td>
<td class="data" style="text-align: center;">2383.80</td>
<td class="data" style="text-align: center;">225.23</td>
<td class="data" style="text-align: center;">1.55</td>
</tr>
<tr>
<th>parse_int</th>
<td class="data" style="text-align: center;">1.44</td>
<td class="data" style="text-align: center;">16.50</td>
<td class="data" style="text-align: center;">815.19</td>
<td class="data" style="text-align: center;">6454.50</td>
<td class="data" style="text-align: center;">337.52</td>
<td class="data" style="text-align: center;">2.17</td>
</tr>
<tr>
<th>quicksort</th>
<td class="data" style="text-align: center;">1.49</td>
<td class="data" style="text-align: center;">55.84</td>
<td class="data" style="text-align: center;">132.71</td>
<td class="data" style="text-align: center;">3127.50</td>
<td class="data" style="text-align: center;">713.77</td>
<td class="data" style="text-align: center;">4.11</td>
</tr>
<tr>
<th>mandel</th>
<td class="data" style="text-align: center;">5.55</td>
<td class="data" style="text-align: center;">31.15</td>
<td class="data" style="text-align: center;">65.44</td>
<td class="data" style="text-align: center;">824.68</td>
<td class="data" style="text-align: center;">156.68</td>
<td class="data" style="text-align: center;">5.67</td>
</tr>
<tr>
<th>pi_sum</th>
<td class="data" style="text-align: center;">0.74</td>
<td class="data" style="text-align: center;">18.03</td>
<td class="data" style="text-align: center;">1.08</td>
<td class="data" style="text-align: center;">328.33</td>
<td class="data" style="text-align: center;">164.69</td>
<td class="data" style="text-align: center;">0.75</td>
</tr>
<tr>
<th>rand_mat_stat</th>
<td class="data" style="text-align: center;">3.37</td>
<td class="data" style="text-align: center;">39.34</td>
<td class="data" style="text-align: center;">11.64</td>
<td class="data" style="text-align: center;">54.54</td>
<td class="data" style="text-align: center;">22.07</td>
<td class="data" style="text-align: center;">8.12</td>
</tr>
<tr>
<th>rand_mat_mul</th>
<td class="data" style="text-align: center;">1.00</td>
<td class="data" style="text-align: center;">1.18</td>
<td class="data" style="text-align: center;">0.70</td>
<td class="data" style="text-align: center;">1.65</td>
<td class="data" style="text-align: center;">8.64</td>
<td class="data" style="text-align: center;">41.79</td>
</tr>
</tbody>
</table>
<p>An honorable mention should probably go to the JS V8 engine, which is pretty speedy, although not very well suited to scientific computing.</p>
<ul>
<li><strong>Syntax and flexibility</strong>. Syntax is perhaps not the most crucial thing for seasoned programmers, but for new programmers having a syntax which basically looks like MATLAB + Python is only a good thing &#8211; code looks like pseudocode at times (example below).</li>
</ul>
<div><script src="https://gist.github.com/2624061.js"></script><noscript><pre><code class="language-julia julia">function qsort!(a,lo,hi)
    i, j = lo, hi
    while i &lt; hi
        pivot = a[(lo+hi)&gt;&gt;&gt;1]
        while i &lt;= j
            while a[i] &lt; pivot; i = i+1; end
            while a[j] &gt; pivot; j = j-1; end
            if i &lt;= j
                a[i], a[j] = a[j], a[i]
                i, j = i+1, j-1
            end
        end
        if lo &lt; j; qsort!(a,lo,j); end
        lo, j = i, hi
    end
    return a
end</code></pre></noscript></div>
<p>Another thing I liked was the code&#8217;s flexibility, where even the most unassuming expressions can trigger massive operations. As Stefan put it,</p>
<p style="text-align: center;"><em>“ (a+b) can do a single machine instruction or start up a cluster”</em></p>
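<p>The reason one expression can mean so many things is that <code>+</code> is an ordinary generic function with a method per combination of argument types. A minimal sketch, assuming Julia 1.x syntax and a made-up <code>Money</code> type (the cluster end of the spectrum is not shown here):</p>
<pre><code class="language-julia julia"># Hypothetical type, made up for this example.
struct Money
    cents::Int
end

# Add a new method to the existing + function for our own type.
Base.:+(a::Money, b::Money) = Money(a.cents + b.cents)

println(1 + 2)                    # plain integer addition
println([1, 2] + [3, 4])          # elementwise array addition
println(Money(150) + Money(275))  # dispatches to the method defined above</code></pre>
<p>All three lines use the same <code>a + b</code> syntax; which code actually runs depends only on the types of the operands.</p>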
<p>Despite all the awesome language features, one of the things I was most impressed with was the actual presentation. Trying to push a new language and make it appealing is not an easy task. Stefan’s presentation<strong>[1]</strong> was awesome – in my opinion it was pitched at precisely the right level, the mix of overview and code was great, and he dealt with questions exceptionally well. The presentations from <strong><a href="http://www.statalgo.com/">Shane Conway</a>[2]</strong> and <strong><a href="http://www.johnmyleswhite.com/">John Myles White</a>[3]</strong> only backed up what Stefan had said, and indeed further showcased the language’s power and simplicity, but in a totally non-fanboyish way.</p>
<p>Julia&#8217;s future looks pretty exciting too. Stefan said that there will hopefully be a package manager and a module system out in the next month or so. I think when this happens, we’ll start to see some of the general scientific packages ported over (something I’d be interested in doing). Additionally, we can expect improved performance and stability ahead of the 1.0 release.</p>
<p>For me, this is very much one to watch, and perhaps get involved in&#8230;</p>
<p>For those who missed it, Stefan and Jeff Bezanson (another of the core team) gave <strong><a href="http://channel9.msdn.com/Events/Lang-NEXT/Lang-NEXT-2012/Julia">a talk</a></strong> and <strong><a href="http://channel9.msdn.com/Blogs/Charles/Stefan-Karpinski-and-Jeff-Bezanson-Julia-Programming-Language">an interview</a></strong> at the <strong><a href="http://channel9.msdn.com/Events/Lang-NEXT/Lang-NEXT-2012">lang.NEXT</a></strong> conference this April.</p>
<p><strong>[1] <a href="http://files.meetup.com/1406240/julia.pdf">http://files.meetup.com/1406240/julia.pdf</a></strong><br />
<strong>[2] <a href="http://files.meetup.com/1406240/julia_lm.pdf">http://files.meetup.com/1406240/julia_lm.pdf</a></strong><br />
<strong>[3] <a href="https://github.com/johnmyleswhite/julia_nyhackr">https://github.com/johnmyleswhite/julia_nyhackr</a></strong></p>
]]></content:encoded>
			<wfw:commentRss>http://www.holehouse.org/programming/julia-an-exciting-time-for-statistical-programming/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Why does Google think I&#8217;m a Skyrim addict?</title>
		<link>http://www.holehouse.org/thoughts/why-does-google-think-im-a-skyrim-addict/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=why-does-google-think-im-a-skyrim-addict</link>
		<comments>http://www.holehouse.org/thoughts/why-does-google-think-im-a-skyrim-addict/#comments</comments>
		<pubDate>Wed, 28 Mar 2012 14:53:39 +0000</pubDate>
		<dc:creator>alex</dc:creator>
				<category><![CDATA[Thoughts]]></category>
		<category><![CDATA[Google]]></category>
		<category><![CDATA[Search]]></category>

		<guid isPermaLink="false">http://www.holehouse.org/?p=468</guid>
		<description><![CDATA[I have never played Skyrim. Nor have I ever searched for it. I&#8217;ve watched one video about the game which a friend posted on Facebook (worth your time), and mentioned on a Facebook thread abstractly how I&#8217;d like to play it at some point, but other than that I&#8217;ve had no interaction with the game, [...]]]></description>
				<content:encoded><![CDATA[<p>I have never played <strong><a href="http://www.elderscrolls.com/skyrim/">Skyrim</a></strong>. Nor have I ever searched for it. I&#8217;ve watched one video about the game which a friend posted on Facebook (<strong><a href="http://www.youtube.com/watch?v=sTgUm8VEWiU">worth your time</a></strong>), and mentioned on a Facebook thread abstractly how I&#8217;d like to play it at some point, but other than that I&#8217;ve had no interaction with the game, nor expressed any interest in it.</p>
<p>Despite this, Google search autocomplete seems to think I&#8217;m a Skyrim fiend.</p>
<p><a href="http://www.holehouse.org/wp-content/uploads/2012/03/google1.png"><img class="aligncenter size-large wp-image-469" title="I will find ALL the answers" src="http://www.holehouse.org/wp-content/uploads/2012/03/google1-1024x327.png" alt="" width="584" height="186" /></a></p>
<p>I checked out my ad preferences to see if that was the issue &#8211; nothing about video games there at all, which makes sense as I&#8217;m not really a gamer any more. To explore a little further I signed out of Google, and opened IE9. No difference.</p>
<p><a href="http://www.holehouse.org/wp-content/uploads/2012/03/google2.png"><img class="aligncenter size-large wp-image-471" title="More answers please" src="http://www.holehouse.org/wp-content/uploads/2012/03/google2-1024x498.png" alt="" width="584" height="284" /></a></p>
<p>Next I VPNed to the UK (I live in NYC) and tried searching (again through IE9 while logged out of the big G). This was interesting &#8211; I didn&#8217;t get exactly the same results, and in fact my &#8220;What is the best&#8230;&#8221; prompt failed to generate any Skyrim-related queries. Despite this, other question openers generated those same Skyrim-centric questions.</p>
<p><a href="http://www.holehouse.org/wp-content/uploads/2012/03/google3.png"><img class="aligncenter size-large wp-image-472" title="More British searches, but still with an overwhelming Skyrim theme" src="http://www.holehouse.org/wp-content/uploads/2012/03/google3-1024x293.png" alt="" width="584" height="167" /></a></p>
<p>Am I the only person experiencing these results? Does Google know what I want before I do? Is the algorithm broken?</p>
]]></content:encoded>
			<wfw:commentRss>http://www.holehouse.org/thoughts/why-does-google-think-im-a-skyrim-addict/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>A can of sea monsters</title>
		<link>http://www.holehouse.org/thoughts/a-can-of-sea-monsters/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=a-can-of-sea-monsters</link>
		<comments>http://www.holehouse.org/thoughts/a-can-of-sea-monsters/#comments</comments>
		<pubDate>Wed, 08 Feb 2012 19:44:02 +0000</pubDate>
		<dc:creator>alex</dc:creator>
				<category><![CDATA[Thoughts]]></category>
		<category><![CDATA[Evolution]]></category>
		<category><![CDATA[Science]]></category>

		<guid isPermaLink="false">http://www.holehouse.org/?p=460</guid>
		<description><![CDATA[On the 5th of February, Russian researchers announced they had finally managed to make contact with Lake Vostok&#8217;s body of water. It&#8217;s no exaggeration for me to say I&#8217;ve been waiting half my lifetime for this. Ever since finding out about Vostok in 2000, I&#8217;ve been fascinated by the challenges and mystery associated with what is effectively another planet on the [...]]]></description>
				<content:encoded><![CDATA[<p>On the 5th of February, Russian researchers announced they had finally managed to <strong><a href="http://en.ria.ru/society/20120208/171219060.html">make contact with Lake Vostok&#8217;s body of water</a></strong>. It&#8217;s no exaggeration for me to say I&#8217;ve been waiting half my lifetime for this. Ever since finding out about Vostok in 2000, I&#8217;ve been fascinated by the challenges and mystery associated with what is effectively another planet on the surface of ours. Lake Vostok is a fresh water lake 4km beneath the Antarctic surface. It&#8217;s believed to have formed through a mixture of glacial ice melting caused by the earths warmth and the extreme pressure from the weight of ice above. The most startling thing about the lake, however, it&#8217;s magnitude &#8211; it&#8217;s about the same size as Wales.</p>
<p><a href="http://www.holehouse.org/wp-content/uploads/2012/02/vostok.jpg"><img class="aligncenter size-medium wp-image-461" title="Lake Vostok" src="http://www.holehouse.org/wp-content/uploads/2012/02/vostok-300x189.jpg" alt="" width="300" height="189" /></a></p>
<p>For the last 14 million years, the world has been evolving with (largely) the same selective pressures, and, most importantly, with the potential for biological exchange between different habitats. Obviously, this exchange is often dependent on geography &#8211; which is why Australia has some strange animals not seen elsewhere, but even in Australia you have things like wind, rain, sun, seasons etc. The &#8220;general&#8221; macro-environmental selective pressures are the same. In the oceans you don&#8217;t have this geographical separation &#8211; a fish can swim from the Pacific to the Atlantic, and can swim from the sea bed to the surface. Whether or not the modern-day versions of the species we see actually make this kind of migration is another (somewhat irrelevant) question, but the fact it&#8217;s possible has a lot of implications for how those water-dwelling creatures have evolved. Lakes are a little different, but typically big lakes are connected to other lakes and the oceans by rivers. They are still part of this &#8220;surface&#8221; habitat, which has sun, changing water levels, the impact of land animals and birds, seasons, wind, etc.</p>
<p>Lake Vostok has neither the same selective pressures, nor biological exchange. There has probably (I&#8217;m always uneasy with absolutes) been no biological exchange with the outside world for 14 million years. On top of that, it is an environment <em>so </em>foreign compared with anywhere else on the planet. There&#8217;s no light, so the food chain must be based on chemosynthetic organisms (as opposed to photosynthetic ones &#8211; plants and algae). The water is cold, estimated at -3 degrees Centigrade, but kept liquid by the lake&#8217;s high pressure. There are no seasons and no lake &#8220;surface&#8221;, as the lake&#8217;s interface is either rock or ice, although there are small tides. Finally, the level of oxygen and nitrogen in the lake is 50 times higher than in normal freshwater lakes.</p>
<p>Needless to say, it&#8217;s quite a unique environment.</p>
<p>You might well assume there&#8217;ll be nothing down there, and that any life found would be highly inactive to try and conserve the minimal energy it has access to. Those as wildly optimistic as myself would disagree. When scientists explored similar, but much smaller, caves in Romania that had been sealed off from the rest of the world, they found over thirty new species, the largest of which were scaled to the size of their environment (i.e. they could still move around freely). These species were highly active, and to some extent resembled modern insects but had many features that were totally novel. Granted, these caves were open air (as opposed to water), but water provides a more stable environment for growth and evolution. Even if there is no &#8220;large&#8221; life (I&#8217;m hoping for mermaids, but that&#8217;s probably too much to ask for) the micro-organisms that exist could be truly extraordinary.</p>
<p>In a lake the size of Lake Ontario, but twice as deep in places, the possibility of new species, and even new ecosystems within the lake, is truly incredible. More exciting, for the biochemist in me, is the potential for totally new routes of evolution &#8211; new systems for energy generation, new skeletal formation to deal with the pressure, new circulatory systems to take advantage of the high oxygen concentration. Who knows &#8211; while carbon is readily available on the surface through photosynthetic fixation, in water this is very much not the case &#8211; perhaps an ecosystem exists where sulfur, nitrogen or even oxygen is its principal &#8220;biological&#8221; component?</p>
<p>The main risk is that in the process of entering the lake to &#8220;have a look around&#8221; (at 4 km below the surface, this is a bit of an engineering feat) we contaminate the lake. The various parties involved (Russian, American and British teams are all drilling) have been careful to try and avoid this, but in what may be an ecosystem radically different from our own, it seems to me that assessing what would be a contaminant is an impossible task &#8211; surely anything represents a contaminant.</p>
<p>Whatever the case, the next stage is to recover water, which is expected towards the end of 2012. Still more waiting, but hopefully, something truly fascinating will be at the end of the wait.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.holehouse.org/thoughts/a-can-of-sea-monsters/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>
