# ときどきの雑記帖 2012

### December 10, 2012

#### ■_

I was leafing through a textbook for the Fundamental Information Technology Engineer Examination, and sure enough terms like "cumulative bug curve" and "reliability growth curve" showed up, with the explanation stating, as though it were self-evident, that these are known to be well approximated by the "logistic curve" or the "Gompertz curve" (paraphrasing). Is there actually solid evidence for that? Well, at least the people overseas (North America?) who have written books touching on this do seem to have evaluated it properly against data (which, regrettably, they cannot publish). Besides, if it were really that simple, I doubt there would be quite so many different "XX models" out there for describing the "shape" of the reliability growth curve.
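For intuition, here is a minimal sketch of the two curve families the textbook mentions. The parameter values are invented purely for illustration, not fitted to any defect data; both curves are S-shaped, but the Gompertz curve is asymmetric, approaching its ceiling more slowly:

```python
import math

# Two classic shapes used for "reliability growth" / cumulative-defect curves.
# All parameters below are made-up illustration values, not fitted to real data.

def logistic(t, k=100.0, r=0.8, t0=5.0):
    """S-curve, symmetric about its inflection point t0; k is the ceiling."""
    return k / (1.0 + math.exp(-r * (t - t0)))

def gompertz(t, k=100.0, b=5.0, c=0.5):
    """Also S-shaped, but asymmetric: slow start, long tail toward k."""
    return k * math.exp(-b * math.exp(-c * t))

for week in range(0, 13, 2):
    print(f"week {week:2d}: logistic {logistic(week):6.1f}  gompertz {gompertz(week):6.1f}")
```

Whether either family actually fits a given project's bug data is, of course, exactly the empirical question raised above.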

#### ■_ The Theory and Practice of Debugging

The Theory and Practice of Debugging: Why Programs Don't Work | Ohmsha

Written by one of the contributors to Beautiful Code and Making Software and the developer of the GNU Data Display Debugger (DDD), this book proposes efficient ways of tracking down root causes and debugging when programs go wrong. It explains why "systematic" and "automated" debugging is needed, and introduces concrete techniques for achieving it, such as delta debugging and the scientific method, in detail. A godsend of a book for any programmer faced with debugging.


Huh, I hadn't noticed this was a name that also appeared in Beautiful Code and Making Software. Anyway, via Andreas Zeller I found my way to "Why programs fail: A guide to systematic debugging - Andreas Zeller - debugging book" and went through the slides posted there one after another; they are fascinating. What left the strongest impression was slide 18 of http://www.st.cs.uni-saarland.de/whyprogramsfail/pdf/FixingTheDefect.pdf. There is a "TRAFFIC" principle, which assigns a debugging activity to each of the letters T, R, and so on:

The Traffic Principle

Track the problem
Reproduce
Automate
Find Origins
Focus
Isolate
Correct


So there you go. Seeing just this cold probably makes no sense, but if you have time to spare, do take a look.

It doesn't seem to have attracted much attention, though. In general (the rest rather heavily self-censored). An introduction to Delta Debugging - SourceForge.JP Magazine: open-source news galore; netail.net (2007-06)

#### ■_ Zed

Tables » Hummus and Magnets

Tables
by Christian Plesner Hansen Posted on October 6, 2012

I have a thing for mechanical calculators and it recently occurred to me that I knew almost
nothing about two of the most famous ones: Babbage's difference engine and analytical engine.
This led me to read some of the papers from the mid 1800s that were written about them. This
blog post is the first of a few I'm planning to write about that.


The analytical engine usually gets most of the attention but the difference engine is an
interesting invention in its own right. Not only did it solve an important problem, it is the
only one of the two that was complete enough to actually be built. This post is about what made
the difference engine so important that Babbage spent decades trying to build it and why the
British government was willing to pay the bill of over ₤17,000, more than the price of two
warships.


Calculation

Today computation is cheap. Extremely cheap. Imagine the amount of math that goes into just
displaying the image on your screen right now: the layouts, colors, and fonts,
rendering it all on a physical display, and doing it again and again quickly and
smoothly enough that you don't even notice it's happening.


Computation is so cheap that it's easy to forget how expensive it was before electronic
calculators. It used to be that if you wanted to add two numbers together you had to actually add
those numbers together. Manually. Need to multiply or divide two numbers, even just a few digits?
Then you'll have to get the paper out and do long multiplication or long division. I just did a
long multiplication to make the image on the right here. I got it wrong twice before getting it
right and I went from “this'll be fun, I wonder if I still remember how to do this” to “god
this is so tedious” in about 30 seconds.


And those are just the basic building blocks of doing a calculation. Most interesting
computations like calculating interest or the position of the moon in six months
require you to do these manual computations over and over and over again. Or require
operations that you can't easily calculate by hand, like trigonometric functions.


At this point you might be thinking: who cares where the moon is in six months? It turns out, back
in those days a lot of people did. In some cases people's lives depended on it.


On the right here is a table of distances in degrees on the night sky from the center of the moon
to various stars at particular times. The first line gives the distance between the center of the
moon and Aldebaran on March 3, 1775 at noon, 3, 6, and 9 o'clock. Multiply that by 365 days, then
multiply it by a dozen stars, that gives you just some of the tables in this book, the first
edition of the Nautical Almanac and Astronomical Ephemeris from 1774, published from the Royal
Greenwich Observatory. The audience for the almanac were mariners. The first edition of 10,000
copies sold out immediately.


To determine your longitude at sea you need to know the current time at a fixed point.
You can think of it sort of like navigating with time zones. If you know it's 4
o'clock in the afternoon Greenwich and it's noon where you are (which you can tell by
looking at the sun) then you know you're in the -4 time zone which is the one that
goes through eastern Canada, the eastern Caribbean and central South America. This is
a rough analogy but that's the gist of how it works.

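The arithmetic behind that analogy is just the Earth's rotation rate: 360 degrees in 24 hours, i.e. 15 degrees per hour. A tiny sketch (the function name is ours, not from the post):

```python
# Rough longitude from the difference between local (sun) time and Greenwich time.
# The Earth rotates 360 degrees per 24 hours = 15 degrees per hour.
DEGREES_PER_HOUR = 360 / 24  # 15

def rough_longitude(local_hour, greenwich_hour):
    """Negative result = degrees west of Greenwich."""
    return -DEGREES_PER_HOUR * (greenwich_hour - local_hour)

# Local noon while it is 4 o'clock in the afternoon at Greenwich:
print(rough_longitude(12, 16))  # -60.0, i.e. 60 degrees west
```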

Up until around 1850, before accurate clocks were made that could be carried on long voyages, a
reliable way to determine the current time was using lunar distance. The moon and stars in the
night sky move as a perfectly predictable clockwork. A given configuration occurs only once, and
you can calculate in advance precisely what the sky is going to look like at a later time. And
more importantly you can go the other way: given the precise configuration of the sky you can
calculate exactly what time it is.


Actually you don't need the full configuration; all you need to know to calculate the
time is the distance in degrees from the center of the moon to any star. That's where
the almanac comes in. It precomputes those distances so that all a navigator needs to
do is measure the angle (typically using a sextant) and then look the value up in the
almanac. Okay that's actually just the basic principle, there's a lot more to it in
practice: you have to adjust for the distance from the center of the moon to the
circumference, for your position on the earth, for atmospheric refraction, etc. Being
a navigator takes a lot of skill. How do you make those adjustments by the way? More
tables of course.


All this means that having accurate tables is extremely important. An undetected error in the
almanac means a navigation error which can mean shipwreck. This is made worse because many of
these tables are time dependent: one line in the almanac is useful on one day only. As a navigator
you're basically beta testing the data for every single day because nobody has had any reason to
use the data before.


There are many sources of errors in numerical tables. Teams of human computers carried
out the manual calculations, a tedious and error prone process. (Incidentally, it
turns out that the better an understanding you have of the calculation you're carrying
out, the more likely you are to make mistakes as a computer.) Often the same value
would be calculated by more than one human computer and then compared to catch errors
– but checking is an error prone process in itself, and computers can (and did) copy
from each other. Then finally someone has to manually set the values in movable type
and print them, also an obvious source of errors.


Babbage

Babbage was an unorthodox and very gifted mathematician. He was a fan of Leibniz which was still
something of a heresy at his college Trinity, home of Newton, Leibniz's arch rival. He was also
one of the founders of the Analytical Society whose goal it was to replace Newton's formalism for
calculus with Leibniz's. Incidentally, besides inventing calculus independently from Newton,
Leibniz designed a mechanical calculating machine, the stepped reckoner.


Babbage recognized the problem of calculating tables, as most people did, but also had a solution:
the difference engine. The idea behind the difference engine is that most of the functions you
want to create tables for can be approximated by a polynomial. Here is the sine function along
with three approximating polynomials of increasing degree:


As the degree of the polynomial increases the approximation quickly becomes better –
the degree-seven polynomial is quite close:

f_7(x) = x - \frac{x^3}{3!} + \frac{x^5}{5!} - \frac{x^7}{7!}

(The original post writes \frac{x^3}{5!} for the third term, presumably a typo for \frac{x^5}{5!}; corrected above.)
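To see how quickly the approximation converges, here is the degree-7 polynomial evaluated against the true sine (a quick check of our own, not from the original post):

```python
import math

# f_7(x) = x - x^3/3! + x^5/5! - x^7/7!, the degree-7 Taylor polynomial of sin.
def f7(x):
    return (x
            - x**3 / math.factorial(3)
            + x**5 / math.factorial(5)
            - x**7 / math.factorial(7))

for x in (0.5, 1.0, 2.0, 3.0):
    print(f"x={x}: sin(x)={math.sin(x):+.6f}  f7(x)={f7(x):+.6f}  "
          f"error={abs(math.sin(x) - f7(x)):.1e}")
```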

Babbage's idea was to use mechanical means to calculate the approximating polynomials with high
accuracy, and not just to print the result on paper but to do the actual typesetting, eliminating
even the typographer as a source of errors.


But I'll stop here before we get to the juicy details of how the difference engine works and
save that for my next blog post.
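While we wait for those details: the engine's namesake trick is the method of finite differences, by which any polynomial can be tabulated using additions alone once a small column of initial differences is seeded. A sketch of the idea (our own illustration, under the standard textbook formulation; function names are ours):

```python
# Method of differences: the n-th finite difference of a degree-n polynomial
# is constant, so after seeding one column of differences every further
# table entry needs only additions, the operation the engine mechanizes.

def seed_differences(p, x0, degree):
    """Initial column [p(x0), Δp(x0), Δ²p(x0), ...] by repeated differencing."""
    vals = [p(x0 + i) for i in range(degree + 1)]
    col = []
    while vals:
        col.append(vals[0])
        vals = [b - a for a, b in zip(vals, vals[1:])]
    return col

def tabulate(p, x0, degree, count):
    d = seed_differences(p, x0, degree)
    out = []
    for _ in range(count):
        out.append(d[0])
        for i in range(degree):   # additions only from here on
            d[i] += d[i + 1]
    return out

# Example: x^2 + x + 41 (Euler's prime-generating polynomial)
print(tabulate(lambda x: x * x + x + 41, 0, 2, 5))  # [41, 43, 47, 53, 61]
```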



The last figure taken up in volume 2 is Vannevar Bush; it never gets as far as von Neumann (even though his name appears in chapter 1!). Vannevar Bush - Wikipedia

Besides Bush, the people covered in volumes 1 and 2 are roughly these (were Mauchly and Eckert in there? I'm fairly sure Aiken's name wasn't): Charles Babbage - Wikipedia, Alan Turing - Wikipedia, John Atanasoff - Wikipedia, Konrad Zuse - Wikipedia

### December 9, 2012

#### ■_

Trouble At Code School

Trouble At Code School
Written by Mike James
Friday, 07 December 2012 09:31

Computer Science Education Week is a good time to consider how things are going in the teaching
of programming. The verdict however is more a "what are these guys doing" rather than
a "well done". What is wrong with code school?

Being brought up in the first era of the home computer I can't really understand how the coding
skill got lost. Back in the 1980s we were all keen to learn to program and even schools seemed to
be taking on the task of teaching kids to program. It wasn't a time of revolution because it just
seemed obvious that learning to program was an essential skill for the world we found ourselves
in. It was inevitable.

Then things started to go wrong.

(rest omitted)


#### ■_ Patent information processing

I wonder what kind of book it is.

It seems a bit removed from my own interests, though: Natural Language Processing series, Patent Information Processing: A Language-Processing Approach | Corona Publishing

...and clicked "buy" anyway.

### December 8, 2012

#### ■_

Note to self: Shogakukan Comics - Big Three Net - [Big Comic Original: next-issue preview]. MASTER Keaton Remaster runs in the issue that goes on sale on the 20th. I remembered that when it last appeared (in August) the next installment was said to be due around December, but I still got a bit worried I might have missed it (heh). "MASTER Keaton Remaster" QUEST 3 "Marion's Wall" impressions (spoilers) | ~ Literacy Bar ~

At this price I really can't reach for it, though: CD ★ Kenritsu Chikyu Boeigun (Prefectural Earth Defense Force) ★ original soundtrack ★ Kiyoshiro Imawano - Yahoo! Auctions

#### ■_ Onig...

Great document on Oniguruma (the way Regex works on Ruby) : ruby

Pat Shaughnessy's article on the Oniguruma VM is also quite interesting:
http://patshaughnessy.net/2012/4/3/exploring-rubys-regular-expression-algorithm



Ruby 2.0.0 uses Onigmo: https://github.com/k-takata/Onigmo



Really? Is there an article that has some sort of comparison?



Well, see the first paragraph in the README + https://github.com/k-takata/Onigmo/blob/master/doc/RE.



Cool. Thanks



#### ■_ Hints on programming language design

An old, old paper by Professor Hoare. ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/73/403/CS-TR-73-403.pdf 1973/CARHoare: Hints on programming language design: 3.1 - Simplicity - Some language designers have replaced the objective of simplicity by that of modularity... (371K PDF) : programming

It's crazy that this was written close to forty years ago:

The view that documentation is something that is added to a program after it has been
commissioned seems to be wrong in principle and counterproductive in practice. Instead,
documentation must be regarded as an integral part of the process of design and coding. A
good programming language will encourage and assist the programmer to write clear
self-documenting code, and even perhaps to develop and display a pleasant style of writing.
The readability of programs is immeasurably more important than their writeability.

(snip) Program readability: supremely important.

So much awesome stuff in here:

Finally, it is absurd to make elaborate security checks on debugging runs, when no trust is
put in the results, and then remove them in production runs, when an erroneous result could
be expensive or disastrous. What would we think of a sailing enthusiast who wears his
lifejacket when training on dry land, but takes it off as soon as he goes to sea?



Written by C.A.R Hoare, reviewed by D. Knuth => Eh, what did you expect ?



Certain programming errors [...] in no case can they be allowed to give rise to machine or
implementation dependent effects, which are inexplicable in terms of the language itself.

Paper published in 1973, just after C exited Bell Labs. sigh



"Written by C.A.R Hoare, reviewed by D. Knuth => Eh, what did you expect?" ... I can't really deny that (heh).

Hm, I have a nagging feeling I once read a Japanese translation of this. "wears his lifejacket when training on dry land, but takes it off as soon as he goes to sea": that turn of phrase looks oddly familiar.

#### ■_ nano

My iPod nano, too, is loaded with enough stuff that the lists get quite long whether sorted by artist, by album, or of course by track title, so hunting for things with the scroll wheel is a real chore, and I've long wanted to do something about it. Can't you jump straight to, say, the middle of a list in one go?

#### ■_ Coding standards

Coding standards, part 1.

Why I Have Given Up on Coding Standards | Richard Rodger

Why I Have Given Up on Coding Standards
Posted on November 3, 2012 by richardrodger

Every developer knows you should have a one, exact, coding standard in your company. Every
developer also knows you have to fight to get your rules into the company standard. Every
developer secretly despairs when starting a new job, afraid of the crazy coding standard some

It's better to throw coding standards out and allow free expression. The small win you get from
increased conformity does not move the needle. Coding standards are technical ass-covering. At
nearForm I don't want one, because I want everyone to think for themselves.

There's a lot of noise out there. The resurrection of JavaScript is responsible. One
“feature” in particular: optional semi-colons. Terabytes of assertion, conjecture
and counter-argument are clogging up the intertubes. Please go write some code instead.
You know who you are.

Well-meaning, and otherwise fabulous developers are publishing JavaScript coding
standards and style guides. You are all wrong. Stop trying to save the world.

Here's what's happening: when you started coding you had no idea what you were doing.
It was all fun and games until you lost an eye. Once you hurt yourself one too many
times with sloppy code, you came to understand that you were a mere apprentice.
Starting on the path to master craftsman, you soaked up Code Complete, The Pragmatic
Programmer, and of course, Joel.

And then, it happened. On the road to Damascus you gained insight. Your new grab bag
(looking back, that's hardly surprising). And now you needed to spread the word. What
worked for you will save others. You cajoled, you preached, you pestered. You lectured
your boss on the need for best practices and standards. And most unforgivable of all,
you blogged.

Most developers don't make noise. Those who make noise, get promoted. You got promoted.
You imposed your brilliant ideas on others, certain of victory. You wrote a coding
standards document, and you made it law.

And then, nothing. The same old slog, the same death marches, the same bugs, the same
misery. No silver bullet.

After a few years, you stopped coding and became a manager. You still know that coding
standards, rules and regulations are vital. All it requires is proper implementation.
You've never quite got there, but you'll keep trying. Hit ‘em over the head a bit
more. Code metrics! In any case, as a manager you get to delegate the pain away.

There is another road. Perhaps you went back to coding, or never left. Over time you
came to realize that you know so little, and all your wonderful ideas are sand castles.
You're washed up. This is the next level of insight.

Other people are smarter than you. Not some of them. All of them. The coder writing
the user interface? They are smarter than you … about the user interface. You're not
writing the code. Why don't you trust them? No, that's not the right question. They
will still mess up. Why are you making a bigger mess by telling them what to do?

You get to the point where you understand that people are not machines. You need to
results.

So why do most intelligent coders do exactly the opposite? What makes us such ready
dictators?

First, you transfer your own experiences onto others. But not everybody thinks like
you. Brains are pretty weird.

Second, control feels good. It's a comfortable hole in the sand. But you can't tell
coders what to do. Cats don't herd.

Third, you get to duck responsibility. Everybody on the team does. We followed the
rules! You failed. Yes, but we followed the rules! Well in that case, here's another
project…

Fourth, good intentions; best practices; professionalism; engineering – the seductions
of process. You are chasing the same gold stars you got when you were eight years old.
But how is the master craftsman judged? By results, only.

Fifth, idealism, the belief that you can understand the world and bend it to your will.
Something we're pretty good at as a species … after we fail a thousand times, and
with repeatable processes. Software projects are always one of a kind.

There are worse sins than these. You only need one of them to end up with a coding standard.

heart. They are a little message that you are not good enough. You cannot quite be
trusted. Without adult supervision, you'll mess up.

We started nearForm about a year ago, and one thing we really care about is writing
great code for our clients. In earlier lives, I've tried all the processes and methods
and rules of thumb. They all suck. None of them deliver.

Starting with the principle that our coders are really smart. That does work.

I expect everyone to write good clean code. You decide what that means. You decide if
you can sleep at night with random code layouts and inconsistent variable names. But
you know what, maybe they just don't matter for a 100 line node.js mini-server that
only does one thing. You decide.

It is your responsibility, because you can code.



#### ■_

Why I Have Given Up on Coding Standards : programming

There are really only two golden rules you need to follow:

Follow precedent set by others when modifying existing code.
Remain consistent in logical chunks.

There is hardly any problem with one module following one standard while another follows another
standard, but it is a huge problem if styles are mixed and matched within code blocks.


There is one golden rule to programming:

Write code as if your successor is a homicidal maniac who knows where you live.


Do they have automated tests? that could be the reason you are seeing push-back for trivial fixes,
because re-unit-testing a trivial fix may itself be a non-trivial task... especially for things
like variable renames and (hopefully private) function boundary/scope changes, and localized
conditional simplifications....

and, needless to say, the automated unit tests should be reviewed with the code. you don't have to
apply the same level of style or "correctness" to it, but they should at least satisfy
the following:

they cover all the "interesting" code
they assert that the code satisfies the requirements (or at least doesn't violate them)
they are easy to understand and follow along
they are easy to modify / maintain


Enforcing the standard is what actually gets the real benefits.

It's awesome when you are on a large code base, and certain aspects such as syntax, are
predictable the whole way through.



Let's think of an analogy with other collective, collaborative activities:

If I for some reason am not following the rules of the road in 1% of my driving, I probably have a good reason for not doing so.
If I for some reason am not following the rules of football in 1% of my games, I probably have a good reason for not doing so.
If I for some reason am not following accepted accounting principles in 1% of my returns, I probably have a good reason for not doing so.
If I for some reason am not following the guidelines for safe food handling in 1% of my cooking, I probably have a good reason for not doing so.

If you are a one-man band and no-one else is ever going to have to read, maintain, or integrate
with your code: go wild! If you are in a team you owe it to your colleagues to play nicely.


Coding standards promote code readability across the team, regardless of turnover. Homogeneous code
also makes QA tools easier to write and maintain (at least in theory) and reduces merge noise in
the repository.

Anyone in a position of authority over developers has (hopefully) earned the right to impose such
things. Ideally the person imposing the standard actually writes code at the time, and/or consults
the team members about what the standard should be.

Any team member who can't or won't adapt to a coding standard probably doesn't belong on the team.



I also spotted MISRA mentioned in passing.

#### ■_

Coding standards, part 2: the X server's.

X.Org Wiki - CodingStyle

This page describes the X server's current coding style. While the server was recently reformatted
to fit this style, most modules have varied and disparate coding styles. Above all, the cardinal
rule is to fit in: make sure your changes reflect the coding style of the surrounding code.

We use the indent command line in this script here:
http://cgit.freedesktop.org/xorg/util/modular/tree/x-indent.sh with manual editing
afterwards to fix the cases where indent gets hopelessly confused.

Four-space indents (no tabs, not even if your editor wants to collapse eight consecutive spaces down to a single tab)

78-column limit

Function return type (and any modifiers, eg static) on a line by itself

Opening curly brace on the same line as the control construct: if (foo) {

Closing braces aligned with the keyword that opened them (K&R not GNU)

else on a new line from the closing } of the preceding if (i.e. not cuddling)

Opening curly brace for functions in column 0

Keywords punctuated like if (x >= 0)

Functions punctuated like doSomethingClever(a, b, c);

case aligned in the same column as the switch

If wrapping is required, function arguments to be aligned to the opening parenthesis of that column

Wrap structs in typedefs

C-style comments, rather than C++/C99-style // foo

C89 + some extensions, see http://cgit.freedesktop.org/xorg/xserver/tree/doc/c-extensions

Notable objectionable things in the current coding style:

Most structs have a typedef both for the struct and for a pointer to the struct.


#### ■_

And this one, too, generated plenty of discussion.

The X.Org CodingStyle : programming

Most structs have a typedef both for the struct and for a pointer to the struct.

I've never understood the point of typedefs for structure pointers. We have a tool to define
pointers already, it's the star. It's not like we have int_ptr and char_ptr typedefs.

    Function return type (and any modifiers, eg static) on a line by itself

I am working on a project in my OS class that has this style requirement. What is the reason for this choice?



Being able to grep/ack for ^functionname is so incredibly handy on large codebases.


cscope >> ctags >> grep ^functionname


GNU coding standards explain this convention:

It is also important for function definitions to start the name of the function in column one.
This helps people to search for function definitions, and may also help certain tools recognize them.

I'm not sure which tools are indicated here... perhaps someone else knows?



grep ^functionname *.c

I wouldn't know any other tool that would find this useful but there might be or might have been,
decades ago, when the GNU coding standards was written down on stone tablets.



Indeed. Etags and ctags seem to find them just fine either way.



78 character limit is silly. When was the last time you worked on an 80 column display? Even in
1985 that shit was legacy.

How about we move up to 1980's tech and go with 132 column widths. Or fuck it, how about any width
because we code on 23" high definition multi-head displays because it's no longer 1980.


Because code isn't just viewed at full screen on someone's big monitor. It gets viewed in diffs,
commit logs, terminals and numerous other places that are not your IDE.

People read shorter lines better than longer lines.

A lot of people don't have one window open at a time so while my monitor allows windows to be very
wide, it's retarded to waste space like that. So I have multiple things open and viewable as is the
intelligent way to use a large screen.

Not everyone writes code on identical systems. So just because I may invest in the biggest possible
monitor does that mean everyone else should have to deal with tons of wrapping?


80 columns does seem questionable these days, but I wonder how far one should allow lines to go.

### December 7, 2012

#### ■_

That happened just as I was leaving work.

Well, what can I say.

Panasonic Let's note lineup: the AX2 series, which converts into a tablet, arrives: PC Online. The 10.1-inch "J10" is now sold online only. Say what?!

#### ■_ win32api

I was startled by the behavior of the calendar control (in month view, click the month label and it switches to a year view; click the year label and it switches to a decade view).

#### ■_ Should Developers Start Learning C++?

From InfoQ.

Should Developers Start Learning C++?

Posted by Jonathan Allen on Dec 06, 2012

With the introduction of C++ 11 and C++ CX there has been a lot of renewed interest in the
language. And a lot of developers, especially Windows developers, are wondering if they should
set aside C# and Java in favor of it. John Sonmez argues no.

In his article titled Why C++ Is Not ‘Back', John Sonmez argues that there are only three
reasons to use C++:

You absolutely need to ink out every bit of performance possible out of your software and
you would like to do that with a language that will support OO abstractions.

You are writing code which will directly interface with raw hardware.  (Example: you are
writing a low level driver.)

Memory control and timing is of absolute importance, so you must have completely
deterministic behavior in your system and the ability to manually manage memory.  (Think
real time embedded operating system controlling a moving piece of machinery.)

Herb Sutter, who has heavily praised this article for offering "a thoughtful, hype-free opinion", adds two more reasons:

Servicing, which is harder when you depend on a runtime.

Testing, since you lose the ability to test your entire application (compare doing
for the first time on an end user's machine).



Well, you know. Once a Japanese translation of the article appears, somebody will tweet it and the retweets will liven up my timeline, no doubt. Heh.

#### ■_

This came down my Twitter timeline: Twitter / domxwop: "Say your probability of getting a job offer is 1 in 100..." On a whim, I did the math.

> x<-1:50
> sapply(x, function(z) return(1-(1-1/z)**z))
[1] 1.0000000 0.7500000 0.7037037 0.6835938 0.6723200 0.6651020 0.6600833
[8] 0.6563911 0.6535606 0.6513216 0.6495061 0.6480044 0.6467415 0.6456647
[15] 0.6447356 0.6439259 0.6432138 0.6425828 0.6420197 0.6415141 0.6410576
[22] 0.6406435 0.6402660 0.6399206 0.6396033 0.6393108 0.6390403 0.6387894
[29] 0.6385560 0.6383385 0.6381352 0.6379447 0.6377660 0.6375978 0.6374395
[36] 0.6372900 0.6371487 0.6370149 0.6368880 0.6367676 0.6366531 0.6365440
[43] 0.6364402 0.6363411 0.6362464 0.6361559 0.6360692 0.6359863 0.6359067
[50] 0.6358303
> plot(sapply(x, function(z) return(1-(1-1/z)**z)))


And plotted it.
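The values in the R output above are converging on 1 - 1/e ≈ 0.632: since (1 - 1/n)^n tends to e^{-1}, the probability of at least one success in n tries, each at probability 1/n, approaches about 63.2%. The same check in Python, mirroring the R one-liner:

```python
import math

# P(at least one success in n independent tries, each with probability 1/n)
def p_at_least_one(n):
    return 1 - (1 - 1 / n) ** n

for n in (1, 2, 50, 10_000):
    print(f"n={n}: {p_at_least_one(n):.7f}")

print("limit 1 - 1/e =", 1 - 1 / math.e)
```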

### December 6, 2012

#### ■_

• Mobile
• Agile principles
• Master more than one programming language
• Learn a Javascript ‘dsl' language
• Get to know HTML5 and CSS3
• Responsive design to a certain extent can be done by pure CSS techniques, some tips:
• Continuous integration / delivery

Nothing surprising here; it all seems about right.

### December 5, 2012

#### ■_ Out of print

Hmm. Well, it is a somewhat old book.

### December 4, 2012

#### ■_

I'm making no progress at all reading through the Julia source code. JuliaLang/julia

#### ■_ Oberon

Oberon Day 2011 - Talks

Keynote

Niklaus Wirth, Prof. emeritus, ETH Zurich, Switzerland
"Ceres and Oberon, Then and Now"

The motivation behind project Oberon 25 years ago was the creation of a computer, a language
and an operating system that concentrated on features that were necessary, sufficient,
explicable, justifiable, and efficiently implementable. These characteristics are particularly
desirable - and rather indispensable - for teaching programming, and for software design in general.

We briefly explain why and how project Oberon came into existence. The steadily growing maze of
complexity and bulk of software indicates that these goals are still relevant, actually more so
than ever. This has spurred new activities with and about Oberon. We present a brief overview,
covering recent work in Zurich on both hardware and software.

Biography: Niklaus Wirth taught Computer Science at ETH in Zurich from 1968 until his retirement in
1999. He designed the languages Pascal (1970), Modula-2 (1979), and Oberon (1986). He was also the
principal designer of the computers Lilith (1978) and Ceres (1986-89).



#### ■_

Top technology and software trends - 2012 - 2013 - Implements Developer {

I'm still getting a lot of visits on my old 2010 post about top software development trends
- so I guess there is interest in this topic. I feel many of my points were fair in that post
and many still stand, but the additions from the last years are clearly missing.

Here's the list of new technology trends to follow based on my experience and prediction:


### December 3, 2012

#### ■_

And so, in the end, I went and registered as a user → Director Mamoru Oshii's "movies you watch to learn how to win": Nikkei Business Online. The final trigger that made me decide to register was this: Patlabor 2 is required viewing for "middle managers who want to get something done": Nikkei Business Online

Reading "How the 'capable middle manager' inevitably breaks down": Nikkei Business Online gave me some things to think about. Or maybe not.

The closing passage of the Patlabor 2 installment (which I believe you cannot read without registering; sorry) says: "In that sense, even today, when people say Patlabor they often mean Captain Gotoh, because the character left such an impression on viewers. Not just middle-aged men of the same generation, but younger people too. Actually, Captain Gotoh had a real-life model." And that has me extremely curious!

#### ■_

I get the feeling we are heading into the kind of situation depicted in 「銀輪の巨人」, where you have to change whether you want to or not. At times like that, being able to "place small bets" strikes me as quite important. As the world goes, swinging for a bases-loaded, come-from-behind home run only after you've been driven to the wall usually ends in disappointment.

While I'm at it, a word on "reverse innovation." In a nutshell: instead of the familiar "glocalization" approach of taking products built for developed markets, reworking them into "budget versions," and selling those in developing markets, the claim is that it matters to start from what those markets actually need and then extend the result even to the wealthy in developed countries. Several case studies are presented; the medical-equipment ones were genuinely persuasive (and the book does not entirely reject the conventional approach either). It then argues that development structures and organizations must change to match. Now try applying that to Japanese companies...

Of course, even armed with this kind of knowledge and information, I'm nowhere near a position where I could put any of it to use. So why do I keep reading these books? ○|￣|＿

### December 2, 2012

#### ■_

I bought and finished these a little while ago.

As for 「マンガでよむ社会学」, it might have been better to wait for the tie-in novelties (clear file folders and such) instead of buying it right after release.

I'm still debating what to do about my impressions of Makers (not a review!). I'd kind of like to write them up, but it's a pa(in)... no, no, if I'm going to paste in affiliate links, the least I can do is that much.

#### ■_ Perl 6

Hmm, there is a lot here I don't know. Probably because I haven't been following it seriously.

#### ■_ Big E

That thing, you know the one. I seem to remember an article in Rekishi Gunzō (probably) saying the name Enterprise would not be carried over, but I guess that has changed. Still, like her predecessor, she ends her days as scrap.

[Fleet carriers] Postwar aircraft carrier discussion thread, ship No. 14 [Light carriers]

830 : Nameless Third-Class Soldier [sage] 2012/12/02 (Sun) 13:32:02.26 ID:??? Be:
Enterprise retired; name to be carried over to next-generation carrier - US
http://www.jiji.com/jc/c?g=int_30&k=2012120200029

Oh, so it's been decided that the name goes to the third Ford-class carrier.
Farewell, "Big E," until we meet again.

831 : Nameless Third-Class Soldier [sage] 2012/12/02 (Sun) 14:36:31.35 ID:??? Be:
So how many ships will have carried the name after this one?

832 : Nameless Third-Class Soldier [sage] 2012/12/02 (Sun) 16:13:08.37 ID:??? Be:
>>830
The Wikipedia crowd works fast, lol. No wonder dates of death get updated the instant someone dies, lol

http://ja.wikipedia.org/wiki/%E3%82%B8%E3%82%A7%E3%83%A9%E3%83%AB%E3%83%89%E3%83%BBR%E3%83%BB%E3%83%95%E3%82%A9%E3%83%BC%E3%83%89%E7%B4%9A%E8%88%AA%E7%A9%BA%E6%AF%8D%E8%89%A6
Ten Ford-class carriers are planned [1]. Construction of three of them is currently in progress or scheduled.
CVN-80 Enterprise


#### ■_

To read later.

Debugging: Art or Science? | Dr Dobb's

Debugging is the hardest part of programming to describe systematically, because its very purpose
is to deal with unsystematic behavior. A programmer plans for a program to behave in a particular
way; debugging is what happens after the programmer discovers that the program is not behaving as
planned. As a result, it is hard to write — or even think — about debugging in general. Every bug
is different, so saying something that applies to the act of debugging requires finding something
general to say about a bunch of unrelated specific cases that defy generalization by their very
nature.




### December 1, 2012

#### ■_

Amazon.co.jp: 魏志 文帝紀 建安マエストロ! 1 (MF Comics Flapper Series): 中島 三千恒: Books

--------- Addendum follows -----------
It appears the web comic magazine that serialized this title has ceased publication.



Whaaat?! ○|￣|＿

#### ■_ Strange Loop 2012

Browsing News - Strange Loop, I see some interesting-looking talks among those scheduled for release. Picking out a few:

• 10-Dec-2012 Go: code that grows with grace Andrew Gerrand
• 10-Dec-2012 Computer Architecture of the 1960's Carlton Mills
• 31-Dec-2012 Plan: a new dialect of Lisp David Kendal
• 7-Jan-2013 Scaling scalability: Evolving Twitter Analytics Dmitriy Ryaboy
• 7-Jan-2013 Information Rich Programming with F# 3.0 Donna Malayeri
• 7-Jan-2013 monad examples for normal people, in Python and Clojure Dustin Getz
• 14-Jan-2013 Compiling Scala to LLVM Geoff Reedy
• 18-Feb-2013 A Type Driven Approach to Functional Design Michael Feathers
• 25-Feb-2013 Grace: an open source educational OO language Michael Homer
• 8-Apr-2013 Wolfram's data analysis platform Taliesin Beynon
• 8-Apr-2013 Programming by Voice: becoming a computer whisperer. Tavis Rudd
• 15-Apr-2013 Numeric Programming in Scala with Spire Tom Switzer, Erik Osheim
• 15-Apr-2013 A Whole New World Gary Bernhardt

Also, the slides have been published here as well, not just on InfoQ, and this route requires no user registration: strangeloop2012/slides at master · strangeloop/strangeloop2012

#### ■_ LLVM Proposes Adding Modules to C

The topic reddit was buzzing about has become an InfoQ article. Apple's proposal for modules in C(++) [PDF slides] : programming

LLVM Proposes Adding Modules to C

Posted by Alex Blewitt on Nov 30, 2012

At the November LLVM developers meeting, Doug Gregor of Apple gave a presentation on adding
modules to C. From the talk's abstract:

The C preprocessor has long been a source of problems for programmers and tools alike.
Programmers must contend with widespread macro pollution and include-ordering problems due to
ill-behaved headers. Developers habitually employ various preprocessor workarounds, such as
LONG_MACRO_PREFIXES, include guards, and the occasional #undef of a library macro to mitigate
these problems.



Well, a translated article will probably appear eventually.

#### ■_

Eli Bendersky's website » Life of an instruction in LLVM

Life of an instruction in LLVM
November 24th, 2012 at 3:37 pm

LLVM is a complex piece of software. There are several paths one may take on the quest of
understanding how it works, none of which is simple. I recently had to dig in some areas of LLVM
I wasn't previously familiar with, and this article is one of the outcomes of this quest.

What I aim to do here is follow the various incarnations an "instruction"
takes when it goes through LLVM's multiple compilation stages, starting from a
syntactic construct in the source language and until being encoded as binary machine
code in an output object file.


This article in itself will not teach one how LLVM works. It assumes some existing
familiarity with LLVM's design and code base, and leaves a lot of "obvious"
details out. Note that unless otherwise stated, the information here is relevant to
LLVM 3.2. LLVM and Clang are fast-moving projects, and future changes may render parts
of this article outdated.

If you notice any discrepancies, please let me know and I'll do my best to fix them.
Input code

I want to start this exploration process at the beginning – C source. Here's the
simple function we're going to work with:

int foo(int aa, int bb, int cc) {
  int sum = aa + bb;
  return sum / cc;
}

Clang

Clang serves as the front-end for LLVM, responsible for converting C, C++ and ObjC source into
LLVM IR. Clang's main complexity comes from the ability to correctly parse and semantically
analyze C++; the flow for a simple C-level operation is actually quite straightforward.


Clang's parser builds an Abstract Syntax Tree (AST) out of the input. The AST is the main
"currency" in which various parts of Clang deal. For our division operation, a
BinaryOperator node is created in the AST, carrying the BO_Div "operator kind" [1].
Clang's code generator then goes on to emit a sdiv LLVM IR instruction from the node, since this
is a division of signed integral types.


LLVM IR

Here is the LLVM IR created for the function [2]:

define i32 @foo(i32 %aa, i32 %bb, i32 %cc) nounwind {
entry:
  %add = add nsw i32 %aa, %bb
  %div = sdiv i32 %add, %cc
  ret i32 %div
}
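As a side note on semantics: sdiv is C's truncating signed division, which differs from Python's floor division on negative operands. A minimal sketch of what foo computes (the helper names c_sdiv and c_foo are mine, not LLVM's):

```python
def c_sdiv(a, b):
    # C/LLVM sdiv truncates toward zero; Python's // floors toward -infinity.
    q = abs(a) // abs(b)
    return -q if (a < 0) != (b < 0) else q

def c_foo(aa, bb, cc):
    # Mirrors the IR above: %add = aa + bb, then %div = sdiv %add, %cc.
    return c_sdiv(aa + bb, cc)

print(c_sdiv(7, -2))   # -3, where Python's 7 // -2 would be -4
print(c_foo(1, 6, -2)) # -3
```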

In LLVM IR, sdiv is a BinaryOperator, which is a subclass of Instruction with the opcode SDiv [3].
Like any other instruction, it can be processed by the LLVM analysis and transformation passes.
For a specific example targeted at SDiv, take a look at SimplifySDivInst. Since all through the
LLVM "middle-end" layer the instruction remains in its IR form, I won't spend much time
talking about it. To witness its next incarnation, we'll have to look at the LLVM code generator.

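To get a feel for what a pass like SimplifySDivInst does, here is a toy folding function over a symbolic operand model (the three identities shown are common algebraic folds; the real pass handles a different and larger set of cases):

```python
def simplify_sdiv(lhs, rhs):
    """Return a simplified value for lhs / rhs, or None when no fold applies.
    Operands are integer constants or variable names (strings)."""
    if rhs == 1:        # X / 1  ->  X
        return lhs
    if lhs == 0:        # 0 / X  ->  0   (assuming X != 0)
        return 0
    if lhs == rhs:      # X / X  ->  1   (assuming X != 0)
        return 1
    return None         # leave the instruction alone

print(simplify_sdiv("x", 1))    # x
print(simplify_sdiv(0, "y"))    # 0
print(simplify_sdiv("x", "x"))  # 1
print(simplify_sdiv("x", "y"))  # None
```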

The code generator is one of the most complex parts of LLVM. Its task is to "lower" the
relatively high-level, target-independent LLVM IR into low-level, target-dependent "machine
instructions" (MachineInstr). On its way to a MachineInstr, an LLVM IR instruction passes through
a "selection DAG node" incarnation, which is what I'm going to discuss next.


Selection DAG node

Selection DAG [4] nodes are created by the SelectionDAGBuilder class acting "at the service of"
SelectionDAGISel, which is the main base class for instruction selection. SelectionDAGISel goes
over all the IR instructions and calls the SelectionDAGBuilder::visit dispatcher on them. The
method handling a SDiv instruction is SelectionDAGBuilder::visitSDiv. It requests a new SDNode
from the DAG with the opcode ISD::SDIV, which becomes a node in the DAG.

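SelectionDAG construction also unifies structurally identical node requests (in LLVM this CSE goes through a FoldingSet). A toy uniquing builder to illustrate the idea (the class and method names are mine):

```python
class ToyDAG:
    def __init__(self):
        self._nodes = {}  # (opcode, operand identities) -> node dict

    def node(self, opcode, *operands):
        # An identical (opcode, operands) request returns the existing node.
        key = (opcode, tuple(id(op) for op in operands))
        if key not in self._nodes:
            self._nodes[key] = {"opcode": opcode, "operands": operands}
        return self._nodes[key]

dag = ToyDAG()
a = dag.node("copy", "aa")
c = dag.node("copy", "cc")
d1 = dag.node("sdiv", a, c)
d2 = dag.node("sdiv", a, c)  # same request a second time
print(d1 is d2)  # True: the graph stays a DAG instead of duplicating nodes
```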

The initial DAG constructed this way is still only partially target dependent. In LLVM
nomenclature it's called "illegal" – the types it contains may not be directly
supported by the target; the same is true for the operations it contains.


There are a couple of ways to visualize the DAG. One is to pass the -debug flag to llc,
which will cause it to create a textual dump of the DAG during all the selection
phases. Another is to pass one of the -view options which causes it to dump and
display an actual image of the graph (more details in the code generator docs). Here's
the relevant portion of the DAG showing our SDiv node, right after DAG creation (the
sdiv node is in the bottom):


Before the SelectionDAG machinery actually emits machine instructions from DAG nodes,
these undergo a few other transformations. The most important are the type and
operation legalization steps, which use target-specific hooks to convert all operations and
types into ones that the target actually supports.

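A classic instance of type legalization is an i64 operation on a target with only 32-bit registers: the value is split into halves and the operation is expanded. A conceptual sketch of a 64-bit add built from 32-bit pieces (LLVM expresses this with add-with-carry style nodes; the function below is only an illustration):

```python
MASK32 = (1 << 32) - 1

def add64_via_i32(a, b):
    # Split each 64-bit value into (lo, hi) 32-bit halves.
    a_lo, a_hi = a & MASK32, (a >> 32) & MASK32
    b_lo, b_hi = b & MASK32, (b >> 32) & MASK32
    lo = (a_lo + b_lo) & MASK32
    carry = (a_lo + b_lo) >> 32          # 0 or 1
    hi = (a_hi + b_hi + carry) & MASK32  # wraps like the hardware would
    return (hi << 32) | lo

print(hex(add64_via_i32(0xFFFFFFFF, 1)))  # 0x100000000 (carry into high half)
```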

"Legalizing" sdiv into sdivrem on x86

The division instruction (idiv for signed operands) of x86 computes both the quotient and the
remainder of the operation, and stores them in two separate registers. Since LLVM's instruction
selection distinguishes between such operations (called ISD::SDIVREM) and division that only
computes the quotient (ISD::SDIV), our DAG node will be "legalized" during the DAG
legalization phase when the target is x86. Here's how it happens.

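The win from the combined form is easy to sketch: a source-level x/y and x%y pair costs one division instead of two. (Python's divmod floors rather than truncates, so a C-accurate model has to adjust signs; the helper name is mine.)

```python
def c_sdivrem(a, b):
    # One "instruction", two results, like x86 idiv: the quotient truncates
    # toward zero, and the remainder satisfies q * b + r == a.
    q = abs(a) // abs(b)
    if (a < 0) != (b < 0):
        q = -q
    return q, a - q * b

# x / y and x % y now share a single division:
q, r = c_sdivrem(-7, 2)
print(q, r)  # -3 -1, matching C's -7/2 and -7%2
```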

An important interface used by the code generator to convey target-specific information to the
generally target-independent algorithms is TargetLowering. Targets implement this interface to
describe how LLVM IR instructions should be lowered to legal SelectionDAG operations. The x86
implementation of this interface is X86TargetLowering [5]. In its constructor it marks which
operations need to be "expanded" by operation legalization, and ISD::SDIV is one of them.
Here's an interesting comment from the code:


// Scalar integer divide and remainder are lowered to use operations that
// produce two results, to match the available instructions. This exposes
// the two-result form to trivial CSE, which is able to combine x/y and x%y
// into a single instruction.

When SelectionDAGLegalize::LegalizeOp sees the Expand flag on a SDIV node [6] it
replaces it by ISD::SDIVREM. This is an interesting example to demonstrate the
transformation an operation can undergo while in the selection DAG form.


Instruction selection – from SDNode to MachineSDNode

The next step in the code generation process [7] is instruction selection. LLVM provides a generic
table-based instruction selection mechanism that is auto-generated with the help of TableGen. Many
target backends, however, choose to write custom code in their SelectionDAGISel::Select
implementations to handle some instructions manually. Other instructions are then sent to the
auto-generated selector by calling SelectCode.


The X86 backend handles ISD::SDIVREM manually in order to take care of some special cases and
optimizations. The DAG node created at this step is a MachineSDNode, a subclass of SDNode which
holds the information required to construct an actual machine instruction, but still in DAG node
form. At this point the actual X86 instruction opcode is selected – X86::IDIV32r in our case.


Scheduling and emitting a MachineInstr

The code we have at this point is still represented as a DAG. But CPUs don't execute
DAGs, they execute a linear sequence of instructions. The goal of the scheduling step
is to linearize the DAG by assigning an order to its operations (nodes). The simplest
approach would be to just sort the DAG topologically, but LLVM's code generator
employs clever heuristics (such as register pressure reduction) to try and produce a
schedule that would result in faster code.

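The baseline mentioned above, a plain topological sort, is short enough to show. Here is a sketch over a hand-written dependence map for foo (the node names are mine; the real scheduler works on SDNodes and layers cost heuristics on top):

```python
def topo_order(deps):
    """deps maps each node to the list of nodes it depends on (its operands).
    Returns a linear order in which every node follows its operands."""
    order, seen = [], set()

    def visit(n):
        if n in seen:
            return
        seen.add(n)
        for d in deps[n]:
            visit(d)
        order.append(n)  # all operands are already emitted at this point

    for n in deps:
        visit(n)
    return order

foo_deps = {
    "aa": [], "bb": [], "cc": [],
    "add": ["aa", "bb"],
    "sdiv": ["add", "cc"],
    "ret": ["sdiv"],
}
print(topo_order(foo_deps))  # ['aa', 'bb', 'cc', 'add', 'sdiv', 'ret']
```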

Each target has some hooks it can implement to affect the way scheduling is done. I won't dwell
on this topic here, however.

Finally, the scheduler emits a list of instructions into a MachineBasicBlock, using
InstrEmitter::EmitMachineNode to translate from SDNode. The instructions here take the
MachineInstr form ("MI form" from now on), and the DAG can be destroyed.


We can examine the machine instructions emitted in this step by calling llc with the
-print-machineinstrs flag and looking at the first output that says "After
instruction selection":


# After Instruction Selection:
# Machine code for function foo: SSA
Function Live Ins: %EDI in %vreg0, %ESI in %vreg1, %EDX in %vreg2
Function Live Outs: %EAX

BB#0: derived from LLVM BB %entry
Live Ins: %EDI %ESI %EDX
%vreg2<def> = COPY %EDX; GR32:%vreg2
%vreg1<def> = COPY %ESI; GR32:%vreg1
%vreg0<def> = COPY %EDI; GR32:%vreg0
%EAX<def> = COPY %vreg3; GR32:%vreg3
CDQ %EAX<imp-def>, %EDX<imp-def>, %EAX<imp-use>
%vreg4<def> = COPY %EAX; GR32:%vreg4
%EAX<def> = COPY %vreg4; GR32:%vreg4
RET

# End machine code for function foo.

Note that the output mentions that the code is in SSA form, and we can see that some
registers being used are "virtual" registers (e.g. %vreg1).


Register allocation – from SSA to non-SSA machine instructions

Apart from some well-defined exceptions, the code generated from the instruction selector is in
SSA form. In particular, it assumes it has an infinite set of "virtual" registers to act
on. This, of course, isn't true. Therefore, the next step of the code generator is to invoke a
"register allocator", whose task is to replace virtual by physical registers, from the
target's register bank.


The exceptions mentioned above are also important and interesting, so let's talk about
them a bit more.

Some instructions in some architectures require fixed registers. A good example is our division
instruction in x86, which requires its inputs to be in the EDX and EAX registers. The instruction
selector knows about these restrictions, so as we can see in the code above, the inputs to IDIV32r
are physical, not virtual registers. This assignment is done by X86DAGToDAGISel::Select.


The register allocator takes care of all the non-fixed registers. There are a few more
optimization (and pseudo-instruction expansion) steps that happen on machine
instructions in SSA form, but I'm going to skip these. Similarly, I'm not going to
discuss the steps performed after register allocation, since these don't change the
basic form operations appear in (MachineInstr, at this point). If you're interested,


Emitting code

So we now have our original C function translated to MI form – a MachineFunction
filled with instruction objects (MachineInstr). This is the point at which the code
generator has finished its job and we can emit the code. In current LLVM, there are
two ways to do that. One is the (legacy) JIT which emits executable, ready-to-run code
directly into memory. The other is MC, which is an ambitious object-file-and-assembly
framework that's been part of LLVM for a couple of years, replacing the previous
assembly generator. MC is currently being used for assembly and object file emission
for all (or at least the important) LLVM targets. MC also enables "MCJIT",
which is a JIT-ting framework based on the MC layer. This is why I'm referring to
LLVM's JIT module as legacy.


I will first say a few words about the legacy JIT and then turn to MC, which is more
universally interesting.


The sequence of passes to JIT-emit code is defined by LLVMTargetMachine::addPassesToEmitMachineCode.
It calls addPassesToGenerateCode, which defines all the passes required to do what most of this
article has been talking about until now – turning IR into MI form. Next, it calls addCodeEmitter,
which is a target-specific pass for converting MIs into actual machine code. Since MIs are already
very low-level, it's fairly straightforward to translate them to runnable machine code [8]. The x86
code for that lives in lib/Target/X86/X86CodeEmitter.cpp. For our division instruction there's no
special handling here, because the MachineInstr it's packaged in already contains its opcode and
operands. It is handled generically with other instructions in emitInstruction.


MCInst

When LLVM is used as a static compiler (as part of clang, for instance), MIs are passed down to
the MC layer which handles the object-file emission (it can also emit textual assembly files).
Much can be said about MC, but that would require an article of its own. A good reference is
this post from the LLVM blog. I will keep focusing on the path a single instruction takes.


LLVMTargetMachine::addPassesToEmitFile is responsible for defining the sequence of actions
required to emit an object file. The actual MI-to-MCInst translation is done
in the EmitInstruction of the AsmPrinter interface. For x86, this method is
implemented by X86AsmPrinter::EmitInstruction, which delegates the work to the
X86MCInstLower class. Similarly to the JIT path, there is no special handling for our
division instruction at this point, and it's treated generically with other
instructions.


By passing -show-mc-inst to llc, we can see the MC-level instructions it creates,
alongside the actual assembly code:


foo:                                    # @foo
# BB#0:                                 # %entry
movl    %edx, %ecx              # <MCInst #1483 MOV32rr
#  <MCOperand Reg:46>
#  <MCOperand Reg:48>>
leal    (%rdi,%rsi), %eax       # <MCInst #1096 LEA64_32r
#  <MCOperand Reg:43>
#  <MCOperand Reg:110>
#  <MCOperand Imm:1>
#  <MCOperand Reg:114>
#  <MCOperand Imm:0>
#  <MCOperand Reg:0>>
cltd                            # <MCInst #352 CDQ>
idivl   %ecx                    # <MCInst #841 IDIV32r
#  <MCOperand Reg:46>>
ret                             # <MCInst #2227 RET>
.Ltmp0:
.size   foo, .Ltmp0-foo

The object file (or assembly code) emission is done by implementing the MCStreamer interface.
Object files are emitted by MCObjectStreamer, which is further subclassed according to the actual
object file format. For example, ELF emission is implemented in MCELFStreamer. The rough path a
MCInst travels through the streamers is MCObjectStreamer::EmitInstruction followed by a
format-specific EmitInstToData. The final emission of the instruction in binary form is, of
course, target-specific. It's handled by the MCCodeEmitter interface (for example
X86MCCodeEmitter). While code in the rest of LLVM is often tricky because it has to make a
separation between target-independent and target-specific capabilities, MC is even more
challenging because it adds another dimension – different object file formats. So some code is
completely generic, some code is format-dependent, and some code is target-dependent.

Assemblers and disassemblers

A MCInst is deliberately a very simple representation. It tries to shed as much semantic information
as possible, keeping only the instruction opcode and list of operands (and a source location for
assembler diagnostics). Like LLVM IR, it's an internal representation with multiple possible
encodings. The two most obvious are assembly (as shown above) and binary object files.


llvm-mc is a tool that uses the MC framework to implement assemblers and disassemblers.
Internally, MCInst is the representation used to translate between the binary and
textual forms. At this point the tool doesn't care which compiler produced the
assembly / object file.

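MCInst's deliberately small shape, an opcode plus an operand list, is what lets the same representation sit between both directions. A toy round-trip sketch (the class and the text format are mine, far simpler than MC's):

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ToyInst:
    opcode: str
    operands: tuple

def to_text(inst):
    # "Assembler" direction: struct -> textual form.
    return inst.opcode + " " + ", ".join(inst.operands)

def from_text(line):
    # "Disassembler" direction: textual form -> struct.
    opcode, _, rest = line.partition(" ")
    return ToyInst(opcode, tuple(s.strip() for s in rest.split(",")))

i = ToyInst("idivl", ("%ecx",))
print(to_text(i))                  # idivl %ecx
print(from_text(to_text(i)) == i)  # True: the round-trip is lossless
```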

[1] 	To examine the AST created by Clang, compile a source file with the -cc1 -ast-dump options.

[2] 	I ran this IR via opt -mem2reg | llvm-dis in order to clean-up the spills.

[3] 	These things are a bit hard to grep for because of some C preprocessor hackery employed
by LLVM to minimize code duplication. Take a look at the include/llvm/Instruction.def file
and its usage in various places in LLVM's source for more insight.

[4] 	A DAG here means Directed Acyclic Graph, which is a data structure LLVM code generator
uses to represent the various operations with the values they produce and consume.

[5] 	Which is arguably the single scariest piece of code in LLVM.

[6] 	This is an example of how target-specific information is abstracted to guide the
target-independent code generation algorithm.

[7] 	The code generator performs DAG optimizations between its major steps, such as between
legalization and selection. These optimizations are important and interesting to know
about, but since they act on and return selection DAG nodes, they're out of the focus of
this article.

[8] 	When I'm saying "machine code" at this point, I mean actual bytes in a buffer,
representing encoded instructions the CPU can run. The JIT directs the CPU to execute code
from this buffer once emission is over.



Feel free to link here.

Send mail here.