ときどきの雑記帖混迷編2011年3月(下旬)

2011年03月31日

■_ The End of Monkeypatching

モンキーパッチ止めようぜ。かな?

The End of Monkeypatching
The End of Monkeypatching

By Xavier Shay / March 29, 2011

Monkey-patching is so 2010. We're in the future, and with Github and Bundler there is now
rarely a need to monkey-patch Ruby code in your applications.

Monkey-patching is the dangerous-yet-frequently-useful technique of re-opening existing
classes to change or add to their behavior. For example, if I have always felt that Array
should implement the sum method, I can add it in my codebase:

モンキーパッチは既存のクラスの変更や振る舞いの追加を行うために
クラスを再オープンする dangerous-yet-frequently-useful technique です。
たとえば仮にわたしがいつもArrayにはsumメソッドが実装されているべきだと感じていたならば
それを自分の codebase で追加できます:

class Array
 def sum
   inject {|sum, x| sum + x }
 end
end

That is a monkey-patch. Of course, when I require activesupport it also adds a sum method to
Array though its version has an arity of one and takes a block. This conflict can cause hard
to track down errors and is why monkey-patching is to be used with caution.

これがモンキーパッチです。もちろん、わたしが activesupprt を require したときにも 
(activesupprt 版の)引数一つとブロックをとる sum メソッドが Array に追加されます。この
衝突は追跡が困難なエラーを引き起こす可能性があります。そしてそれがモンキーパッチングを
注意して使われている理由です。


Thankfully, the spread of this abuse is minimal and most developers understand the risks.
More frequently, monkey-patching is used to quickly fix bugs in existing libraries by
reopening a class and replacing an existing method with an implementation that works
correctly. This is often a fragile solution, relying on on sometimes complex techniques to
override exactly the right bit of code, and also on the underlying library not be refactored.

ありがたいことに、spread of this abuse は最小限であり、大半の developer はモンキーパッ
チングを使うことのリスクを理解しています。より頻繁には、モンキーパッチングは既存のライ
ブラリにあるバグをクラスの再オープンして既存のメソッドを正しく動くコードに差し替えるこ
とでさっさとバグを修正するのに使われています。これは大抵の場合 fragile solution (不安
定な解決策) であり、本当に正しい位置でコードを上書きするために複雑なテクニックに依存し
ており、またライブラリがリファクタリングされないことにも依存しています。

In the dark ages when it was troublesome to own or release your own versions of gems, this was
the only cheap solution available. Nowadays though, a modern Ruby application has access to a
far easier and more robust solution: fork the offending code, fix it at the source, and set up
your application dependencies to use your new code directly. All without having to package a
new gem!

自分用のgemを持ったりリリースをすることに問題があった暗黒時代には、こういった
モンキーパッチを駆使することが行使できる唯一の cheap solution でした。
しかし現在、modern なRubyアプリケーションは格段に簡単に、かつ頑強な solution を
とれるようになっています。問題のあるコードを fork して、ソースコードを修正してから
それを自分のアプリケーションに直接新しいコードをセットアップするのです。
すべてはパッケージやgemを持つ必要なく行えます。

The first few steps of this process are mostly self-explanatory, but I have documented them
below anyway. If you are already old hat at this stuff, feel free to skip directly to step 4.

このプロセスの最初の数ステップは mostly self-explanatory ですが、
以下に示したようにドキュメントを残しました。
もしすでにこのことについて知っているのであれば、ステップ4まで飛ばしてしまっても
問題ありません。


Step 1: Fork the Library
ライブラリを forkする

It is rare to find a Ruby gem or library that isn't on Github these days. After locating the
library, always check the network graph to try and find other popular forks. Often the problem
you are trying to solve has already been fixed by another developer. In that case, you can skip
straight to step 3. Otherwise, fork the code base to your own GitHub account.

今日、Github にはない Ruby gem やライブラリを見つけるのは rare なことです。

Step 2: Make Your Changes
変更を行う

Clone your fork and make whatever changes you need. If you are feeling generous, add 
an appropriate test to the code base as well so it can be contributed back to the 
original fork, but as long as you have a test somewhere (such as in your main app) for 
the desired behavior you will be fine.

Step 3: Change Your Gemfile
Gemfile を変更する

Point your Gemfile at the new code:

# Gemfile
# From this
gem 'rails'

# To this
gem 'rails', :git => 'git://github.com/xaviershay/rails', :branch => 'important-fix'

And reinstall your gems by running bundle.

bundle を実行して gems をインストールします。

Step 4: Document

This step is important. There is no excuse for skipping it. You need documentation in 
three places:

  1. A note at the top of the README in your fork, documenting the changes. Any developer
     an stumble across a public fork, and there is nothing more frustrating than trying to
     figure out whether a fork already solves your problem. At the very least, a “here be
     dragons” note will be appreciated.

  2. The place in your code base that depends on the fork. You can expect other developers
     to be familiar with rails and the standard gems. They won’t be familiar with the
     behavior of your changes.

  3. The Gemfile. Make a note above your gem line as to why a fork is required. 
     Provide enough information that a future developer will know when or if it would be 
     appropriate to upgrade or switch back to the main gem. Here are some real examples 
     from some of my projects:

# An experimental fix for memory bloat issues in development, if it works
# I will be patching to core.
gem ...

# 1.1 requires rubygems > 1.4, so won't install on heroku. This fork removes
# that dependency, since it is actually only required for the development
# dependencies.
gem ...

# Need e86f5f23f5ed15d2e9f2 in master and us to upgrade to dm-core 1.1
# before we switch back. Should be in 1.1.1 release.
gem ...

略
Copyright c 2006～2011 Peter Cooper

■_

Non-trivial monads in a functional programming language without closures (functions are first class but can't be partially applied)? : haskell

This is a generic FP lang design question and not really Haskell, but r/functional's 
latest post is 1 year ago so I put it here instead.

I'm designing a language with a statically-allocatable restriction, based on some 
research about using FP langs for behavioural hardware synthesis. To summarize, the 
limitations imposed in order to make behavioural synthesis tractable are:

■_

A Quick Look at the Rust Programming Language

2011-03-31

A Quick Look at the Rust Programming Language

The Rust Programming Language is a systems programming language being developed by
Mozilla. It was announced last year and has seen quite a bit of development since then.

Rust は Mozzila によって開発されているシステムプログラミング言語です。
その開発は昨年アナウンスされており、以後開発が続いていました。

I've only been lightly following the development over the past year but recently
decided to spend a bit more time looking at it. The following is a look at the
language and implementation from someone who isn't heavily involved in the
development so don't take anything I write as gospel. Hopefully I don't get too many
things wrong.

The Rust Language FAQ lists the following summary of features:
Rust のFAQ では以下のような機能のサマリをあげています:

* Memory safe. No null pointers, wild pointers, etc. Automatic storage management.
メモリー安全。ナルポインターや wild ポインターといったものはありません。
自動的にメモリー管理されます。

* Mutability control. Immutable by default. No shared mutable state across tasks.
Mutabliity の制御。デフォルトでは immutable であり、タスクをまたがって共有される
mutable state はありません、

* Dynamic execution safety: task failure / unwinding, trapping, logging. RAII / dtors.
安全な動的実行

* Typestate system: ability to define complex invariants that hold over data structures.

* Explicit memory control. Layout and allocation control. Interior / value types.

* Very lightweight tasks (coroutines). Cheap to spawn thousands-to-millions.
非常に軽量なタスク(コルーチン)。

* Stack iterators (effectively lambda-blocks w/o heap allocation).
スタックイテレーター (ヒープの割り当てを行わない効率の良いラムダブロック)

* Static, native compilation. Emits ELF / PE / Mach-o files.

* Direct and simple interface to C code (switch stacks and call, ~8 insns).
Cのコードに対する直接的で単純なインターフェース
(スタックの切り替えと呼び出しで最大8命令)

* Multi-paradigm. pure-functional, concurrent-actor, imperative-procedural, OO.

* First class functions with bindings.

* Structurally-typed objects (no nominal types or type hierarchy).

* Multi-platform. Developed on Windows, Linux, OSX.
マルチプラットフォーム

* UTF8 strings, assortment of machine-level types.
UTF8 文字列

* Works with existing native toolchains. GDB / Valgrind / Shark / etc.
既存の native ツールチェインと一緒に動作する

* Practical rule-breaking: can break safety rules, if explicit about where and how.

略
Conclusion

Rust is an interesting language and there's quite a bit more to it than I've covered
here. I just picked random features to try. I'm looking forward to rustc being more
complete and trying out some of the language features in more ‘real world' examples
to get a feel for it.

Rust はわたしがここで紹介したよりもずっと面白い言語です。
わたしは単にランダムにいくつかの機能を取り上げようとしただけです。

■_ オススメ本

推薦図書/必読書のためのスレッド 61 

387 デフォルトの名無しさん [sage] 2011/03/31(木) 21:05:15.81 ID: Be:
    >>385
    どういたまして＾＾

    未だに良書と出会って無いんだが、アルゴリズムはコード丸暗記じゃ無く、理論的に覚えた方が良い

    個人的にはhaskellのアルゴリズム辞典が出て欲しい
    理論的にも理解し易く(素人でもアルゴリズムの原理を理解しやすい)、実際に動かせるコードになるからね
    (実は自分が手元に置いておきたい)

    理論的に理解できれば、新しい言語への移植も容易になる

    命令の最適化よりも、アルゴリズムによる最適化の方がパフォーマンスに大きく貢献する。と言うのは覚えておいて欲しい

388 デフォルトの名無しさん [sage] 2011/03/31(木) 21:40:01.63 ID: Be:
    >>387
    > 個人的にはhaskellのアルゴリズム辞典が出て欲しい

    辞典ではないが、アルゴリズムの解説本は出てる

    Pearls of Functional Algorithm Design

    これなんか、関数型でのアルゴリズムの考え方を
    Haskell を使ってかなり分かりやすく解説してる

    各トピックで、まず最初に仕様を関数で表現したり、
    拙いが直感的なアルゴリズムを解説してから、
    徐々に効率を上げていく解説方法なんで、ワクワクしながら読める
    またその効率が、なぜ、どのくらい上がるのかの説明も理論的だ 

389 デフォルトの名無しさん [sage] 2011/03/31(木) 21:47:25.30 ID: Be:
    >>388
    洋書の方か!!
    最近探してなかった
    thanks

    日本語に翻訳されたら、珠玉の一冊として紹介されるだろうに。。。

390 デフォルトの名無しさん [sage] 2011/03/31(木) 21:51:33.83 ID: Be:
    >>388
    kindleじゃ出てないんだね。。。
    いや、買うけど

これか。 Amazon.co.jp： Pearls of Functional Algorithm Design: Richard Bird: 洋書

■_ 制限

Twitter / @木村屋: Cの標準関数のヘッダファイルの名前が変に短いのは、処 ...

Cの標準関数のヘッダファイルの名前が変に短いのは、処理系の制限が原因だったのだろうか。

Twitter / @M教授: @kimuraya osのファイルシステムの最大名前 ...

@kimuraya osのファイルシステムの最大名前長の制限

これどうなんだろう。 FORTH の「処理系の名前」の長さがそれに引きずられたとか聞いた覚えはあるけど、関数名だしなあ。ライブラリを一関数一ファイルで作っていたとしても、 UNIXって6文字とかに制限はなかったような (System V で14文字とかいうのを見た覚えはある)。誰か情報を持ってたら教えてください。

■_

2011年03月30日

■_

今月の皇帝陛下
ヴァンデの反乱とか知らなかったヴァンデの反乱 - Wikipedia

■_

最初に学ぶべき言語とは何か?

FAQ: What programming language should I learn first?

Posted on Mar 28th, 2011 in FAQs, Programming Languages, Python

There are hundreds of different programming languages out there. As a newcomer you can
ignore the fact that most of them exist. However, even if we narrow the list to just a
dozen mainstream languages, deciding on what programming language to learn first can
be a daunting task. You might find yourself asking, should I learn C, C++, Java, C#,
or PHP first? If you ask ten programmers this question, you'll probably hear ten
different answers. Here is my take.

Much as with human languages, programming languages are used to communicate.
Interestingly they still involve communication between people, whether other
programmers will end up reading/modifying/enhancing your code or you'll do at a later
point in time. Unlike natural languages however, programming languages are
unequivocally understood by computers, thanks to the aid of interpreters, compilers,
and similar types of software.

略
As such, my suggestion would be to start with Python, and use it as a tool to learn
the general craft of programming. Learning Python is fun, easy, and useful. You'll be
able to use it for a wide array of projects in several environments (scripting, web,
scientific research, etc…).

わたしの提案は、Python から始めること。そしてプログラミングの general craft を学ぶための
ツールとして Python を使つというものです。Python を学ぶことは楽しく、簡単で、有用です。
Python を学ぶことであなたは、いくつかの環境 (スクリプティング、web、科学研究などなど)
における wide array of project のために Python を使えるようになるでしょう。

There are a variety of free tutorials on the web, but if you want a more
rigorous/systematic/academic introduction, I highly recommend “Python Programming: An
Introduction to Computer Science (2nd Edition)” (USA | UK | Canada).

web 上には、様々な無料で手に入るチュートリアルがあります。しかしあなたがより
rigorous/systematic/academic な introduction を求めているのなら、
わたしは “Python Programming: An Introduction to Computer Science (2nd Edition)”
を強くお薦めします。

Once you have learned the fundamentals of programming, have a decent command of the
Python language, and have gained some experience with practical Python projects, you
should be better armed to evaluate and pick up other languages and frameworks based on
the projects you intend to develop or contribute to in the future (Open Source
projects are awesome for this purpose).

プログラミングの基礎 (fundamentals) を学び、

■_ これはない

OKWaveから丸投げネタ。

awkでのsh処理について | OKWave

awkでのsh処理について

HP-UX環境、UNIXです。

1行目の11カラム目にOUTが含まれているかつ2行目の11カラム目にINが含まれている行だけ
ファイルに出力するという処理を以下のように考えたんですが、うまくいきません。

awk'{m == NR % 2}
m==1{if($11~ "OUT")} &&
m==0{if($11~ "IN") print $0} ' [ファイル名]


文法的に誤っていますでしょうか？？
回答宜しくお願い致します。

質問者が選んだベストアンサー

べたなロジックですが

cat ファイル名 | awk '\
BEGIN { FILE=-1 }
{
　if (substr($0,11,3)=="OUT") {
　　LINE=NR
　}
　if (NR==(LINE+1) && substr($0,11,2)=="IN") {
　　print $0
　}
}
END {}' > 出力ファイル名

ではいかがでしょうか。(インデントのために全角空白が入っています)

「sh処理」は何を言っているのか分かりません。

ANo.1

実際やったらエラーになるでしょ。
そのエラーメッセージは読んでますか?

とりあえず

> awk'{m == NR % 2}

kと'の間がつまっているのは元からですか?貼り付けたときに変わていまったのですか?

> m==1{if($11~ "OUT")} &&

この&&はなんでしょう?


なお、2行1セットでやりたいのなら、このやりかたでは難しいでしょう。

色々とすいません。kmeeさん。

kと'の間がつまっているのは
⇒貼り付け時です。

メッセージは
⇒syntaxエラー系のメッセージですね。内容は理解出来ませんでした。


2行1セットのやり方はご存知ですか？？

投稿日時 - 2011-03-29 20:03:10

なんで getline を使わんとですか。

■_

2011年03月29日

■_

イライラが止まらない～

■_ Python の dictionary

や、まあ。ハッシュテーブルなんですが。日曜日に参加したソースコードリーディングで dictobject のところを読んで非常に面白いものだということを知りました。ああ、今日もまとめを書いている時間ががががががおかいがー♪

まず、テーブルの大きさが素数ではありませんでした。以前からテーブルの大きさが素数ではなくてもハッシュ関数がきちんとしたものであれば問題ないという話は何度か目にしていましたが、実際の処理系で2のべきの大きさを使っているのをみたのは多分初めてな気が。そしてもうひとつの興味深い点として、ハッシュ値の衝突の対処としてチェーン法を使っておらず、オープン~~ハッシュ~~アドレッシング(もしくはクローズドハッシュ)を使っているということです。ドキュメントがあったり、コメントにいろいろ書かれているので読んでみると良いかも。

/*
Major subtleties ahead:  Most hash schemes depend on having a "good" hash
function, in the sense of simulating randomness.  Python doesn't:  its most
important hash functions (for strings and ints) are very regular in common
cases:

  >>> map(hash, (0, 1, 2, 3))
  [0, 1, 2, 3]
  >>> map(hash, ("namea", "nameb", "namec", "named"))
  [-1658398457, -1658398460, -1658398459, -1658398462]
  >>>

This isn't necessarily bad!  To the contrary, in a table of size 2**i, taking
the low-order i bits as the initial table index is extremely fast, and there
are no collisions at all for dicts indexed by a contiguous range of ints.
The same is approximately true when keys are "consecutive" strings.  So this
gives better-than-random behavior in common cases, and that's very desirable.

OTOH, when collisions occur, the tendency to fill contiguous slices of the
hash table makes a good collision resolution strategy crucial.  Taking only
the last i bits of the hash code is also vulnerable:  for example, consider
the list [i << 16 for i in range(20000)] as a set of keys.  Since ints are
their own hash codes, and this fits in a dict of size 2**15, the last 15 bits
 of every hash code are all 0:  they *all* map to the same table index.

But catering to unusual cases should not slow the usual ones, so we just take
the last i bits anyway.  It's up to collision resolution to do the rest.  If
we *usually* find the key we're looking for on the first try (and, it turns
out, we usually do -- the table load factor is kept under 2/3, so the odds
are solidly in our favor), then it makes best sense to keep the initial index
computation dirt cheap.

The first half of collision resolution is to visit table indices via this
recurrence:

    j = ((5*j) + 1) mod 2**i

For any initial j in range(2**i), repeating that 2**i times generates each
int in range(2**i) exactly once (see any text on random-number generation for
proof).  By itself, this doesn't help much:  like linear probing (setting
j += 1, or j -= 1, on each loop trip), it scans the table entries in a fixed
order.  This would be bad, except that's not the only thing we do, and it's
actually *good* in the common cases where hash keys are consecutive.  In an
example that's really too small to make this entirely clear, for a table of
size 2**3 the order of indices is:

    0 -> 1 -> 6 -> 7 -> 4 -> 5 -> 2 -> 3 -> 0 [and here it's repeating]

If two things come in at index 5, the first place we look after is index 2,
not 6, so if another comes in at index 6 the collision at 5 didn't hurt it.
Linear probing is deadly in this case because there the fixed probe order
is the *same* as the order consecutive keys are likely to arrive.  But it's
extremely unlikely hash codes will follow a 5*j+1 recurrence by accident,
and certain that consecutive hash codes do not.

テーブルの大きさを2のべきにして、衝突したときの次候補の位置を j = ((5*j) + 1) mod 2**i という式を使って求めているのですが、この式を使うと同じ位置を二度以上通ることなくすべての要素を回って元の位置に帰ってくるらしいのですね(数学的な証明とかわかりません)。

def travel(i)
 j=0
 ary=[]
 begin
   j = ((5*j) + 1) % 2 ** i
   ary.push(j)
 end until j==0
 ary
end

2.upto(20) do |i|
 pat = travel(i)
 puts "length=#{pat.length}"
 puts pat.join('->') if i<9

 p pat.sort == [*0..2**i-1]
end

とこんなスクリプトで試してみました。

length=4
1->2->3->0
true
length=8
1->6->7->4->5->2->3->0
true
length=16
1->6->15->12->13->2->11->8->9->14->7->4->5->10->3->0
true
length=32
1->6->31->28->13->2->11->24->25->30->23->20->5->26->3->
 16->17->22->15->12->29->18->27->8->9->14->7->4->21->10->19->0
true
length=64
1->6->31->28->13->2->11->56->25->62->55->20->37->58->35->
 48->49->54->15->12->61->50->59->40->9->46->39->4->21->42->
 19->32->33->38->63->60->45->34->43->24->57->30->23->52->
 5->26->3->16->17->22->47->44->29->18->27->8->41->14->7->
 36->53->10->51->0
true
length=128
1->6->31->28->13->66->75->120->89->62->55->20->101
->122->99->112->49->118->79->12->61->50->123->104->
9->46->103->4->21->106->19->96->97->102->127->124->
109->34->43->88->57->30->23->116->69->90->67->80->
17->86->47->108->29->18->91->72->105->14->71->100->
117->74->115->64->65->70->95->92->77->2->11->56->
25->126->119->84->37->58->35->48->113->54->15->76->
125->114->59->40->73->110->39->68->85->42->83->32->
33->38->63->60->45->98->107->24->121->94->87->52->
5->26->3->16->81->22->111->44->93->82->27->8->41->
78->7->36->53->10->51->0
true
length=256
1->6->31->156->13->66->75->120->89->190->183->148->
229->122->99->240->177->118->79->140->189->178->123->
104->9->46->231->132->149->234->147->224->97->230->
127->124->109->34->171->88->185->158->23->116->69->
90->195->208->17->86->175->108->29->146->219->72->105->
14->71->100->245->202->243->192->193->198->223->92->
205->2->11->56->25->126->119->84->165->58->35->176->
113->54->15->76->125->114->59->40->201->238->167->
68->85->170->83->160->33->166->63->60->45->226->107->
24->121->94->215->52->5->26->131->144->209->22->111->
44->221->82->155->8->41->206->7->36->181->138->179->
128->129->134->159->28->141->194->203->248->217->62->
55->20->101->250->227->112->49->246->207->12->61->50->
251->232->137->174->103->4->21->106->19->96->225->102->
255->252->237->162->43->216->57->30->151->244->197->
218->67->80->145->214->47->236->157->18->91->200->
233->142->199->228->117->74->115->64->65->70->95->220->
77->130->139->184->153->254->247->212->37->186->163->
48->241->182->143->204->253->242->187->168->73->110->
39->196->213->42->211->32->161->38->191->188->173->98->
235->152->249->222->87->180->133->154->3->16->81->150->
239->172->93->210->27->136->169->78->135->164->53->10->
51->0
true
length=512
true
length=1024
true
length=2048
true
length=4096
true
length=8192
true
length=16384
true
length=32768
true
length=65536
true
length=131072
true
length=262144
true
length=524288
true
length=1048576
true

■_

Coq とか Agda とか見えたので読んでみようと思ったものの

λ Tony's blog λ ≫ Blog Archive ≫ A brief point on static typing
A brief point on static typing

This post also incidentally attempts to justify my answer to a question that I regularly
encounter, what is your favourite programming?


   It's not possible to write bug-free programs.
   (バグのないプログラムを書くことは不可能である。)


This particular myth is quite popular, and its recent inclusion in a discussion I had 
has prompted me to write this post. The profound fact of this matter is that this 
statement is “as false as you can get.” In other words, there exists a formal proof 
of its negation, and subsequent counter-examples. It has also been my experience that 
belief in this myth has degenerative practical implications.

この particular myth はとてもよく知られていて、
最近わたしが行った議論にあったものはわたしにこのpostを書かせる原動力となりました。
The profound fact of this matter is that this statement is “as false as you can get.”
言い換えれば
there exists a formal proof of its negation, and subsequent counter-examples.
It has also been my experience that 
belief in this myth has degenerative practical implications.


To emphasise the point, there exist programming languages for which it is impossible 
to write an incorrect program. What does it mean for a program to be correct? It means 
the program terminates. Such languages include (in order of my decreasing experience 
with them): Coq, Agda, Epigram, Isabelle.

ここで強調すべきは、正しくないプログラム (incorrect program) を書くのが不可能な
プログラミング言語が存在しているということです。
プログラムが正しい状態にあるとはどういうことでしょうか?
それはそのプログラムが終了するということです。
正しくないプログラムを書くことのできないそのような言語には Coq、Agda、Epigram、Isabelle
といったものがあります。

略

■_

2011年03月28日

■_

調子が悪いときはとっとと寝る。

■_

Scala と Javaのアンチパターン

NfNitLoop: Scala vs. a Java Anti-Pattern

Scala vs. a Java Anti-Pattern
Scala と Java のアンチパターン

In the last few months I've been playing with Scala, a new(ish) programming language 
that runs on the JVM. The problem with learning a fun new language is that you start 
wanting to use its features, even when you're still using a different language. (In 
this case, Java.) But I came across a case this past week that illustrated this so 
well that I actually wanted to write it down to share.

ここ数ヶ月の間、わたしはJVM上で動作する新しい(っぽい)プログラミング言語である
Scala を試していました。楽しい新言語を学ぶときの問題は、まだ異なる言語
(この場合は Java) を使っている場合にさえその機能を使いたくなってしまうということです。
But I came across a case this past week that illustrated this so 
well that I actually wanted to write it down to share.


The Anti-pattern
アンチパターン

It's common to need to return more than one item. Maybe a name and date, or an ID and 
a status, or a result and an error code. Languages like Java don't really make this 
easy -- the return statement only takes one argument.

二つ以上のアイテムを返す必要性は一般的なものです。
それはおそらく、名前と日付、ID とステータス、結果とエラーコードといったものであるでしょう。
Java のような言語は複数の値を返すことが容易ではありません。
リターン文はただひとつの引数しかとらないからです。

This often gets solved by returning some type that approximates a tuple. For example:

これはほとんどの場合、タプルのような型を返すことで解決されます。
たとえば

// ...
TuplePair<Integer,ErrorType> pair = foo.computeSomeValue();
if (checkErrorConditionOK(pair.getSecond())) return pair.getFirst();
else throw new AppropriatelyNamedExceptionTypeException(pair.getSecond().getErrorMessage());
// ...

This is pretty ugly. But foo has returned a couple values to us. And with Java 
generics, we even get some compile-time type checking -- pair.getFirst() is an Integer, 
and pair.getSecond() is an ErrorType. But that's a pretty ideal case with a nice 
descriptive type parameter. What if our parameters are of the same type?

これはとても醜いものですが、とにかく foo は二つの値を返します。
そして Java のジェネリクスを一緒に使うことでコンパイル時の型チェックさえ手に入れられます。
pair.getFirst() は Integer であり、pair.getSecond() は ErrorType です。
しかしこの場合に理想的なのは descriptive な型パラメータです。

// ...
TuplePair<Integer,Integer> pair = foo.computeSomeValue();
if (checkErrorConditionOK(pair.getSecond())) return pair.getFirst();
else throw new AppropriatelyNamedExceptionTypeException("Got error code: " + pair.getSecond());
// ...

From this code, I have no way of knowing that the second item in the tuple is an error 
code. I can infer it from the way the code is written. But maybe whoever wrote this 
code got the parameters mixed up. If I'm reading through this code (debugging or doing 
a thorough code review), I now have to open up foo's implementation to see what 
actually gets returned in that tuple.

このコードからはわたしがタプルの二番目の要素がエラーコードであるということを知る手段は
ありません。わたしはそのコードが書かれた手順からそれがエラーコードであることを推測でき
ます。
But maybe whoever wrote this code got the parameters mixed up.
仮にわたしがこのコードを(デバッグやコードレビューなどで)読んだとすると、
foo の実装を開いてタプルで返しているものが実際にはなんであるのかを
確認しなければならないでしょう。


In this case, our TuplePair is barely better than an array.

この場合、TuplePair は配列よりもずっと好ましいものです。

Why the Anti-pattern Persists
なぜアンチパターンが持続するのか

It's easy to understand why I keep seeing this pattern in Java, though. There's a lot 
of boring boilerplate code that has to be written to work around this problem. To make 
the code more meaningful and readable, and to allow for easier refactoring in the 
future, you should assign a name to each thing you want to return. (No, you don't want 
to do that with a Map. That's an anti-pattern for another blog post.) The best way to 
do this is with a new class.

とはいえなぜこのパターンを Java で目撃し続けているのかを理解するのは容易です。
There's a lot of boring boilerplate code that has to be written to work around this problem.
この問題を回避するために書かなければならない退屈な boilerplate code があまりにもたくさ
んあります。このコードをより meaningful かつ読みやすいものにするために、また、将来にお
けるリファクタリングを容易なものにするために、返したいと考えているものそれぞれについて
名前を割り当てるべきなのです
(No, you don't want to do that with a Map. That's an anti-pattern for another blog post.)。
これを行うベストの手段は新しいクラスを使うことです。


In Java, writing a new class is pretty heavyweight. If you want to use proper 
encapsulation, your new class might look like this:

Java では新しいクラスを書くのはとても大ごと (heavyweight) です。
もし適切な encapsulation を使いたいのなら、新しいクラスは次のようになるでしょう:


public class ComputedValue
{
  private int value;
  private int errorCode;

  public ComputedValue(int value, int errorCode)
  {
     this.value = value;
     this.errorCode = errorCode;
  }

  public getValue() { return value; }
  public getErrorCode() { return errorCode; }
}

Since ComputedValue is a public class, you have the option of making it a static inner 
class, or going the more common route and creating an entirely new file. But at least 
now our code is a bit more readable:

ComputedValue はパブリックなクラスなので、これを static な inner クラスにするか
より一般的なルートを行って完全に新規のファイルを作成するかの選択肢があります。
しかし少なくとも現在わたしたちのコードは多少読みやすいものです:


// ...
ComputedValue computed = foo.computeSomeValue();
if (checkErrorConditionOK(computed.getErrorCode()))
    return computed.getValue();
else throw new AppropriatelyNamedExceptionTypeException("Got error code: " + computed.getErrorCode())
// ...

What Scala Offers
Scala が offer するもの

Given the above option, it's not surprising that people keep creating the TuplePair class.
Frankly, it's a pain in the ass to write new Java classes for every case like this. The
thinking is “Write TuplePair once and re-use it so we don't have to do it again,” even
though it leads to useless names like getFirst() and getSecond().

上記の選択肢を与えられたときに、Tupler クラスを作り続ける人がいることは
不思議なことではありません。
率直に言って、このような自体の度ごとにあたらしいJavaのクラスを記述することは
苦痛を伴うものです。その思考は
“Write TuplePair once and re-use it so we don't have to do it again,”
even though it leads to useless names like getFirst() and getSecond().


Scala helps solve this class of problems by making creating these types of classes 
dead simple:

Scala はこういったクラスの型の作成をとことんシンプルにすることで
この問題のクラスの解決を手助けします。

class ComputedValue(val value: Int, val errorCode: Int)

This single line gives you everything that the above Java implementation did: A type 
with “value” and “errorCode” properties, with a constructor for easily creating 
new instances. And while you didn't have to write any boilerplate code to provide 
encapsulation, it's there, implemented automatically by the Scala compiler.

この一行は前述の Java での実装が行っているものすべてを与えています:
“value”と“errorCode”というプロパティーと
新しいインスタンスを生成できるコンストラクターを持った型です。
And while you didn't have to write any boilerplate code to provide encapsulation,
it's there, implemented automatically by the Scala compiler.


Classes in Scala are public by default, and they don't have to be declared in separate files
or as static inner classes. That makes the barrier to entry to writing better code a single
line in a file you already have open. The Scala version of the above snippet becomes:

Scala におけるクラスはデフォルトで public であり、別のファイルで宣言したり static な
inner class とする必要はありません。そういったものは既にオープンしているファイルで
一行の better code を書く妨げになります。
先のコード片の Scala版はこうなります:

// ...
val computed = foo.computeSomeValue();
if (checkErrorConditionOK(computed.errorCode)) return computed.value;
else throw new AppropriatelyNamedExceptionTypeException("Got error code: " + computed.errorCode)
// ...

But Scala gives you even one more nice feature in this simple case. So far, we've 
only been looking at code that consumes a ComputedValue instance. But what about the 
code that creates it?

しかし Scala はこの単純なケースにおいて、さらにもう一つniceな機能を与えています。
So far, we've only been looking at code that consumes a ComputedValue instance.
しかし生成されたコードはどうでしょうか?

// Java:
public ComputedValue computeValue()
{
    return new ComputedValue(0, 42);
}

Even with our new ComputedValue class, this code still has the same problem as our 
TuplePair! Someone reading this code can't tell which is the value and which is the 
error code. To be sure, we would have to open up ComputedValue and look at its 
constructor parameters. But we're still not sure if the developer knew the proper 
parameter order when he wrote that line of code. We can't see his intention from the 
information in the call to the constructor. I use the following pattern to make this 
more clear in Java:

新しい ComputedValue クラスを使ってさえ、このコードにはまだ TuplePair と同じ問題があります!
Someone reading this code can't tell which is the value and which is the error code.
誰かがこのコードを読んでも、どちらが value でどちらがエラーコードなのか区別できません。
To be sure,
ComputedValue を open してそのコンストラクターに対するパラメーターを確認しなければなりません。
But we're still not sure
if the developer knew the proper parameter order when he wrote that line of code.
コンストラクターの呼び出しから得られる情報からは、コードの書き手の意図を読み取れません。
Java においてこれをより明確にするためにわたしは次のパターンを使っています:

public ComputedValue computeValue()
{
    final int value = 0;
    final int errorCode = 42;
    return new ComputedValue(value, errorCode);
}

The above at least makes my intentions clear. But to check whether my argument order 
matches the argument order expected by the constructor, you'll have to look at 
ComputedValue's implementation. How does Scala improve on this?

上記のものは少なくともわたしの意図 intentions を明確にしています。しかし引数の並びがコンス
トラクターが予期しているものとマッチしているかどうかを確認するには、ComputedValue の実
装を確認必要があるでしょう。Scala はこれをどのように改良しているのでしょうか?


def computeValue() = new ComputedValue(value = 0, errorCode = 42)

Again, a single line of Scala replaces several lines of Java. The code here is not 
only more concise, but more explicit. The return type of computeValue() is inferred by 
the result of the method's implementation on the right-hand side. And by using named 
arguments, we not only make our intention clear to the reader, but also to the 
compiler. Scala will check at compile-time that ComputedValue's constructor takes “value”
and “errorCode” parameters, and will even re-order our paramters for us if we got them
in the wrong order.

繰り返しますが、Scala の一行は Java の複数行を置き換えます。ここで挙げたコードは一貫性
があるだけでなくより explicit でもあります。computeValue() の戻り値の型は、その右辺に
あるメソッドの実装の結果に影響を受けます。また、名前つき引数を使うことでリーダーに対し
て意図を明確にするだけでなくコンパイラーに対しても意図を明確にしています。Scala はコン
パイル時に ComputedValue のコンストラクターが“value”と“errorCode”のパラメーターを
取ることをチェックしていて、加えて間違った順序でそれらを渡したときにパラメーターの 
re-order さえするでしょう。

These are some of the most basic features that Scala offers, and yet they're the ones 
that come up in day-to-day work over and over again. By getting tedious boilerplate 
out of the way, Scala lets coders write better code without worrying about making more 
work for themselves.

Scala が提案する最も基本的な機能がいくつかあり、


Posted by Cody Casterline at 6:56 PM

■_

んー、このくらいならという気もするけど染まっちゃってるせいですかね＞strcpy

Code 2 Read » Blog Archive » Writing Code (to Read)

Writing Code (to Read)
(読むために) コードを書く


I wanted to start my blog with a post about strcpy implementation in C. This is an 
oversimplified example of the underlying point I want to make with my blog, but an 
example I wanted to show nonetheless. I first happened upon the differing possible 
implementations of strcpy in an article about C that I found on reddit. It had been a 
few months since I'd written any C code and I was preparing for job interviews for an 
internship. I wanted to brush up on C and I noticed the article. The article was 
phenomenal and really helped refresh my memory. I can't seem to find the article; if 
anyone happens to have a link please share it in the comments.


Here's an implementation of strcpy:
以下の例は strcpyの実装です:

static void strcpy(char destination[], const char source[])
{
    while(*destination++ = *source++);
}

■_

3. Free

Java tests are a kind of “blacklisting” approach to governing implementations. Had
we not added testOneIsNotZero, we could have returned PlusZero and the compiler would
have been happy. In a free category, relations are “whitelisted:” an implementation
must not have relations (constraints) other than the ones that are implied by the
definitions.

Java のテストは実装を governing するための  “blacklisting” アプローチです。
Had we not added testOneIsNotZero, we could have returned PlusZero and the compiler would have been happy.
free category では relations は “whitelisted:”です。
実装は定義により implied される一つを除いて
relations (constraitnts) を持ってはいけません。


   * “The free category” has no objects and no morphisms, because there aren't any
      specified.
      free category はオブジェクトも morphisms も持ちません。
      なぜなら、一切 specified されていないからです。

   * “The free category on one object N” has one object and one morphism, the identity 1_N.
      The morphism has to be there, because the definition of category says that every object
      has to have an identity morphism, but we can't put in any other morphisms.

      “The free category on one object N”は一つのオブジェクトと一つの morphism を持ちます。
      その morphism は存在していなければなりません。なぜなら、他の morphisms を
      put することができないからです。

   * “The free category on the directed graph G”
      有効グラフ G 上の free category は

                          A -f-> B
                          |    /
                      G = h   g
                          |  /
                          V L
                          C

     has
         o three objects A, B, C and
           三つのオブジェクトA、B、C および

         o seven morphisms:
               + 1_A:A \to A,
               + 1_B:B \to B,
               + 1_C:C \to C,
               + f:A \to B,
               + g:B \to C,
               + h:A \to C, and
               + (g \circ f):A \to C.
           以上七つの morphisms を持っています。

     The three vertices and the three edges become objects and morphisms, respectively. The
     three identity morphisms and (g \circ f) are required to be there because of the
     definition of a category. And because the category is free, we know that (g \circ f) \ne h.

     三つの vertices と三つの edges は対応したオブジェクトと morphisms になります。
     この三つの identity morphsms と (g \circ f) は圏の定義によって要求されます。
     またこの圏は free であるので、(g \circ f) \ne h であることがわかります。

     For abritrary directed graphs, the free category on the graph has the vertices of the
     graph as objects and the paths in the graph as morphisms.

     任意の有効グラフのために、そのグラフ上の free category はオブジェクトとしての
     vertices of the graph と morphisms としての paths in the graph を持っています。

   * The parity monoid is completely defined by “the free category on one object N, one
     morphism \mbox{PlusOne}:N\to N, and one relation \mbox{PlusOne} \circ \mbox{PlusOne} = 1_{N}.”

   * If we leave out the relation and consider “the free category on one object N and one
     morphism \mbox{PlusOne}:N\toN,” then we get the natural numbers under addition: since
     we didn't specify a point at which repeated applications of PlusOne “wrap around” back
     to zero, the free category has no such constraint. To add the number n, we form
     \underbrace{\mbox{PlusOne} \circ \mbox{PlusOne} \circ \cdots \circ \mbox{PlusOne}}_{n \mbox{ copies}}.

   * Exercise: what is the common name for “the free category on one object N and two morphisms
     \mbox{ConcatenateZero}:N\to N, \mbox{ConcatenateOne}:N\to N?” What should the identity
     morphism be named?

     問題 common name はなにか

■_

2011年03月27日

■_

日曜日夜のこの時間がない感ときたら。

積読解消。プロジェクトのリーダーの人もそうでない人にもオススメ。 100点満点、減点評価を止めようというのは同意しますが、青天井加点評価へどう移行するのか気にはなったり (「成果主義」の二の舞になりそうな気が)。

積読のまま行方不明になっていたのを再発見したので読んだ。「教えすぎてはいけない」というのはそうなんでしょうね。とはいえ、教わるほうも雛鳥がえさをねだるごとく教えてくれくれいう人が多い気もします。

CPython 3.2 ソースコードリーディング - [PARTAKE] に参加してきました。いろいろメモ書きやらしたのですが時間ががが。とりあえず一言。門番?の警備員さん怖かった。

OS-9 (MacOSぢゃないよ)ってソースコードは今も読めないのだっけ。 CP/Mは公開されたような覚えがあるんだけど。 OS-9 - Wikipedia CP/M - Wikipedia

■_ マ板から

Rubyist とか Pythonista とか聞くけど、ほかの言語ではどういう呼称が使われてるんだろう。あ、英語圏で。なんでもかんでも～er ってこたないと思うのだけど。

スレッドを立てるまでもない質問雑談スレ42 

489 仕様書無しさん [sage] 2011/03/22(火) 13:43:24.37 ID: Be:
    Ruby使いはRubyist、Cobol使いはCOBOLer、じゃあC使いは何なの？ 

497 仕様書無しさん [] 2011/03/25(金) 21:55:15.17 ID: Be:


    ブチョーがコードレビューするのだが

    どーせ隅から隅まで見るわけねえだろ

    と思って

    一箇所にだけ unsigned int auto ってやったんだわ

    したらブチョーが「おい　このautoなんだ」

    見とったわ

    ブチョー隅まで見てたわ

    ビビッた 

498 仕様書無しさん [sage] 2011/03/26(土) 06:32:24.64 ID: Be:
    薄目で全体を見てるんだよキット 

499 仕様書無しさん [sage] 2011/03/27(日) 02:10:18.67 ID: Be:
    >>490
    　純粋なSE会社はコード書かないよ。客から話を聞だして文書作るのが仕事。
    コミュニケーションと技術的に可否の判断が必須、コーディングの知識なんてクソも役に立たんよ。
    某JRシ○テムはそうだった。気をつけて。

    >>497
    もうすぐC++じゃautoさんが生まれ変わっちゃうけどデフォルトでstaticな環境ってまだ存在すんのかな。

■_ Category Theory for the Java Programmer

ぼちぼちと。

Category Theory for the Java Programmer « reperiendi

Category Theory for the Java Programmer
(Javaプログラマーのための圏論)

(略)

1. Objects (オブジェクト)

Both in Java and in category theory, objects are members of a collection. In Java, it 
is the collection of values associated to a class. For example, the class Integer may 
take values from -2^{63} up to 2^{63}-1. The class

Java と圏論の両方でオブジェクトは collection のメンバーです。
Java においてはオブジェクトはあるクラスに結び付けられた値の collection です。
たとえば Integerクラスは -2^{63} から 2^{63}-1 までの値を取ります。

	enum Seasons { Spring, Summer, Fall, Winter }

has four possible values.
というクラスは四つの possible values を持っています。

In category theory, the collection of objects is “half” of a category. The other 
part of the category is a collection of processes, called “morphisms,” that go 
between values; morphisms are the possible ways to get from one value to another. The 
important thing about morphisms is that they can be composed: we can follow a process 
from an initial value to an intermediate value, and then follow a second process to a 
final value, and consider the two processes together as a single composite process.

圏論ではオブジェクトの collection は “half” of a category (圏の“半分”?) です。
The other part of the category is a collection of processes, called “morphisms,”
圏のもう半分は“morphisms”と呼ばれる value 間の process の collection です。
morphisms are the possible ways to get from one value to another.
morphisms で重要なことはそれが compose (集める) できるということです。
initial value から intermediate value への process 、そしてそこから
final value への second process を follow することが可能であり、
この二つの process をまとめてひとつの composite process とみなせるのです。

2. Categories (圏)

Formally, a category is
形式的には圏とは

    * a collection of objects, and
      オブジェクトの集まりと
    * for each pair of objects A, B, a set {\rm hom}(A, B) of morphisms between them
      オブジェクト A、B のペアごとの間にある morphisms  {\rm hom}(A, B) の集合です。

#にほんごおかしい○|￣|＿

such that

    * for each object X the set {\rm hom}(X, X) contains an identity morphism 1_X,

    * for each triple of objects X, Y, Z and pair of morphisms f \in {\rm hom}(X, Y) and
      g \in {\rm hom}(Y, Z), there is a composite morphism (g \circ f) \in {\rm hom}(X, Z),

    * for each pair of objects X, Y and morphism f\in {\rm hom}(X, Y), the identitiy
      morphisms are left and right units for composition: 1_Y \circ f = f = f \circ 1_X, and

    * for each 4-tuple of objects X, Y, Z, W and triple of morphisms f \in {\rm hom}(X, Y),
      g \in {\rm hom}(Y, Z), h \in {\rm hom}(Z, W) composition is associative:
      h\circ (g \circ f) = (h \circ g)\circ f.

If a morphism f \in {\rm hom}(A,B), category theorists write f:A\to B.
ある  morphism f が {\rm hom}(A,B) にあった場合、圏論ではそれを f:A\to B
のように書きます。

We can also define a category in Java. A category is any implementation of the 
following interface:

Javaでも圏を定義できます。
ある圏は以下のインターフェースの実装となります:

	interface Category {
	  interface Object {}
	  interface Morphism {}
	  class IllegalCompositionError extends Error;
	 
	  Object source(Morphism);
	  Object target(Morphism);
	  Morphism identity(Object);
	  Morphism compose(Morphism, Morphism)
	    throws IllegalCompositionError;
	};

that passes the following tests on all inputs:
このインターフェースはすべての入力に対して以下のテストを pass します:

	void testComposition(Morphism g, Morphism f) {
	  if (target(f) != source(g)) {
	    assertRaises(compose(g, f),
	                 IllegalCompositionError);
	  } else {
	    assertEqual(source(compose(g, f)), source(f));
	    assertEqual(target(compose(g, f)), target(g));
	  }
	}
	void testAssociativity(Morphism h,
	                       Morphism g,
	                       Morphism f) {
	  assertEqual(compose(h, compose(g, f)),
	              compose(compose(h, g), f));
	}
	void testIdentity(Object x) {
	  assertEqual(source(identity(x)), x);
	  assertEqual(target(identity(x)), x);
	}
	void testUnits(Morphism f) {
	  assertEqual(f, compose(identity(target(f)), f));
	  assertEqual(f, compose(f, identity(source(f))));
	}

One kind of category that programmers use every day is a monoid. Monoids are sets 
equipped with a multiplication table that's associative and has left- and right units. 
Monoids include everything that's typically called addition, multiplication, or 
concatenation. Adding integers, multiplying matrices, concatenating strings, and 
composing functions from a set to itself are all examples of monoids.

プログラマーが日々使っている圏の一種にモノイドがあります。
モノイドとは
sets equipped with a multiplication table that's associative and has left- and right units
です。
モノイドは典型的には加算、乗算、連結とよばれるようなものすべてを含みます。
a set to itself からの整数の加算、行列の乗算、文字列の連結、関数の合成
はすべてモノイドの例です。

In the language of category theory, a monoid is a one-object category (monos is Greek 
for “one”). The set of elements you can “multiply” is the set of morphisms from 
that object to itself.

圏論の用語では、あるモノイドは one-object category です( mono とはギリシア語での “one”)。
乗算可能な要素の集合は object to it self からなる morphisms の集合です。

In Java, a monoid is any implementation of the Category interface that also passes 
this test on all inputs:

Java ではモノイドはすべての入力について以下のテストも pass する
Categroy インターフェースの任意の実装です。

	void testOneObject(Morphism f, Object x) {
	  assertEqual(source(f), x);
	}

The testOneObject test says that given any morphism f and any object x, the source of 
the morphism has to be that object: given another object y, the test says that 
source(f) == x and source(f) == y, so x == y. Therefore, if the category passes this 
test, it has only one object.

testOneObject テストでは
与えられた morphism f と任意のオブジェクト x、その morphism の source が
そういったオブジェクトでなければならないとしています。
別のオブジェクト y が与えられたならば、このテストは
source(f) == x かつ source(f) == y でありしたがって x == y であると主張しています。
したがって、もしその圏がこのテストを pass したなら、その圏はただひとつだけの
オブジェクトを持っています。

For example, consider the monoid of XORing bits together, also known as addition 
modulo 2. Adding zero is the identity morphism for the unique object; we need a 
different morphism to add 1:

例としてaddition modulo 2 としても知られている
XORing bits together のモノイドを考えてみましょう。
ゼロを加えることは unique なオブジェクトに対する morphism の identity であり、
1を加えるには異なる morphism が必要です。

	interface Parity implements Category {
	  Morphism PlusOne();
	}

The existence of the PlusOne method says that there has to be a distinguished morphism, 
which could potentially be different from the identity morphism. The interface itself 
can't say how that morphism should behave, however. We need tests for that. The 
testPlusOne test says that (1 + 1) % 2 == 0. The testOneIsNotZero test makes sure that 
we don't just set 1 == 0: since (0 + 0) % 2 == 0, the first test isn't enough to catch 
this case.

PlusOne メソッドの存在は区別された morphism が存在してなければならないことを
主張しています。
その morphism は identity morphism とは異なる可能性があります。
インターフェースそれ自身はその morphism がどのように振舞うべきかについて言及していません。
ですからそのためのテストが必要です。
testPlusOne テストは (1 + 1) % 2 == 0 であることを主張しています。
testOneIsNotZero テストは 1 == 0 としないことを保証します。
(0 + 0) % 2 == 0 であるので、最初のテストはこのケースを catch するには不十分です。

	void testOnePlusOne(Object x) {
	  assertEqual(compose(PlusOne(), PlusOne()),
	              identity(x));
	}
	 
	void testOneIsNotZero(Object x) {
	  assertNotEqual(PlusOne(), identity(x));
	}

Here's one implementation of the Parity interface that passes all of the tests for 
Category, testOneObject, and the two Parity tests:

以下の例は Parity インターフェースの実装で、
圏、testOneObject に対するテストのすべてと
二つの Parity テストを渡します:

	class ParityImpl implements Parity {
	  enum ObjectImpl implements Object { N };
	  enum MorphismImpl implements Morphism {
	    PlusZero,
	    PlusOne
	  };
	  Object source(Morphism) { return N; }
	  Object target(Morphism) { return N; }
	  Morphism identity(Object) { return PlusZero; }
	  Morphism compose(Morphism f, Morphism g) {
	    if (f == PlusZero) return g;  // 0 + g = g
	    if (g == PlusZero) return f;  // f + 0 = f
	    return PlusZero;              // 1 + 1 = 0
	  }
	  Morphism PlusOne() { return PlusOne; }
	}

つづく。

わかったようなわからないような。

■_

げしょ。

2011年03月26日

■_

・ダムエー
ガンダム創世が最終回。これ、ガンダムさんの単行本に一緒に収録されてたのね。ガンダムさんは買ってなかったから気がつなかんだ。オリジンはまあそのなんだ。

を見かけたけど、前のとの違いはとりあえずグラフ理論の章が増えたことみたい (分量的にはそれほどでも)。スルーでいいかな。一緒に並んでいた

は、入門には割といいんじゃなかろうかという感じ。全ページカラーで図解いりだし。

■_ Javaプログラマーのための圏論

良くみたら2007年11月の記事か。古いとダメということはないけど今これが再発見されたんだろうか

Category Theory for the Java Programmer « reperiendi

Category Theory for the Java Programmer
(Javaプログラマーのための圏論)

Posted in Category theory, Math, Programming by reperiendi on 2007 November 3

There are several good introductions to category theory, each written for a different 
audience. However, I have never seen one aimed at someone trained as a programmer 
rather than as a computer scientist or as a mathematician. There are programming 
languages that have been designed with category theory in mind, such as Haskell, OCaml, 
and others; however, they are not typically taught in undergraduate programming 
courses. Java, on the other hand, is often used as an introductory language; while it 
was not designed with category theory in mind, there is a lot of category theory that 
passes over directly.

いくつかの優れた、圏論への introduction があって、それぞれ異なる読者に向けて
書かれています。しかし、わたしはこれまでコンピューター科学者でも数学者でもない、
プログラマーに向けて書かれたものを見たことがありません。Haskell や OCmal
のように圏論に関係深く設計されている言語はいくつかありますが、そういった
言語は大学のプログラミングコースで教えられるようなものであるのが典型です。
一方で、Java は introductory language として頻繁に使われています。
Java は圏論を念頭において設計された言語ではありませんが、


I'll start with a sentence that says exactly what the relation is of category theory 
to Java programming; however, it's loaded with category theory jargon, so I'll need to 
explain each part.

A collection of Java interfaces is the free cartesian category with equalizers on 
the interface objects1 and the built-in objects.

1. Objects

以下略

■_ Language of the Month: Mirah

Dr Dobbs ジャーナルには今月の言語 (Language of the Month) なんてのがあるのか。

Dr Dobbs | Language of the Month: Mirah

What happens when you take Ruby syntax, add static typing, and run it on the JVM?
Charles Nutter, the designer of JRuby, shows us

Ruby の構文を持ってきて、静的な型付けを加えてそれをJVM上で実行したら
何がおきるでしょうか? JRuby の desinger である Charles Nutter はわたしたちに
それを見せてくれました。

The JVM is an interesting place these days. We'll have Java 7 soon, with support for
dynamic invocation and improved APIs. We have a many programming languages to choose
from — some statically typed, some dynamically typed, some blurring the lines — that
offer their own unique benefits. We're even getting a few "small changes" to
Java itself, like literal lists and Strings-in-switch. We have the largest and most
robust ecosystem of any managed runtime, with hundreds of conferences, tens of
thousands of libraries, and millions of developers.

今日において JVM は interesting な場所です。
まもなくわたしたちは dynamic invoation と改良されたAPI をサポートする
Java 7 を手にします。わたしたちにはそれぞれが選択肢として unique な利点を offer する
多くのプログラミング言語があります。
そういった選択肢には静的型付けや動的型付け、blurring the lines といったものがあります。
Java それ自身にはリテラルリストや文字列による switch のような
少数の“小さな変更”しかありません。
わたしたちには何百という conferences を持ち、数千のライブラリを有し、そして
何百万という開発者を持った最も大きく最も頑強なmanaged runtime のエコシステムがあります。

But there's something missing. No language seems likely to replace Java itself.

しかし何かが足りないのです。
Javaそのものを置き換えてしまう言語はないように思われます。

Or at least, none of the alternatives fits what I need out of a Java replacement.

少なくとも、Javaを置き換えるのにわたしが必要だと思っていることに fit する
alternatives はありません。

Learning from Java
Java から学ぶ

(略)

The Mirah Programming Language

Mirah grew out of my desire to have a language for implementing JRuby that would be
more approachable to non-Java developers. Let's face it, Java's not too hard to learn,
but there are lots of details that take time to internalize. It's not a complicated
language, but it can scare the uninitiated. With Mirah, I wanted to make a language
that fit my criteria, so that I and others would have the Java replacement we've been
wanting.

(略)

■_ Java は死なず

ちょっと前にあったやつに対する反論らしい＞ Rise and Fall and Rise

Not Dead Yet: The Rise and Fall and Rise of Java – tecosystems

Not Dead Yet: The Rise and Fall and Rise of Java

FOSDEM

In November of last year, Forrester analyst Mike Gualtieri published the provocatively 
titled “Java Is A Dead-End For Enterprise App Development.” This January, the firm's
John Rymer followed up with a more balanced but similarly pessimistic “The Future 
of Java.” Collectively, these may be considered representative of the conventional 
wisdom of the enterprise.

昨年十一月、Forrester のアナリスト Mike Gualtieri は
“Java Is A Dead-End For Enterprise App Development”
という provcatively なタイトルのついたものを公開しました。
今年一月にはその firm の John Rvmer が、よりバランスの取れたしかしやはり悲観的な
フォローアップ記事 “The Future of Java” を公開しました。


The reality that we at RedMonk observe differs substantially, as might be predicted 
given our sources. What you think depends on who you listen to, remember. Forrester, 
as we have discussed previously [coverage], primarily concerns itself with enterprise 
software buyers and “platform software decision-makers.” RedMonk, meanwhile, is 
oriented around practitioners; practitioners like those attending FOSDEM, for example. 
We advantage this audience simply because we believe that bottom up adoption is more 
predictive of technology direction than top down procurement, but reasonable minds may 
obviously disagree.
(以下略)

誰か全文を(ry

■_

Reviewing Perl Best Practices, Chapter 15, Objects - House Absolute(ly Pointless)

Reviewing Perl Best Practices, Chapter 15, Objects

By Dave Rolsky on March 23, 2011 10:40 AM

Perl Best Practices (PBP) was released in 2004, about 7 years ago. Perl is a great
language, but its culture of TIMTOWTDI can be a problem. These days, we often hear
that "there's more than one way to do it, but sometimes consistency is not a bad
thing either" (TIMTOWTDIBSCINABTE). PBP deserves a lot of credit for encouraging
discussion about the downsides of TIMTOWTDI.

Perl Best Practices (PBP) はほぼ7年前、2004年にリリースされたものです。
Perl は great な言語ではありますが、TIMTOWTDI の文化は問題となる可能性を
孕んでいます。今日、わたしたちは
"there's more than one way to do it, but sometimes consistency is not a bad thing either"
(TIMTOWTDIBSCINABTE).
ということを頻繁に耳にします。
PBP は TIMTOWTDI の downsides についての

PBP has a lot of recommendations. A lot of them are fairly trivial, like
recommendations on code formatting. I'm a big fan of using a consistent style
throughout a single code base, but I'm flexible about what that style should be. Some
of his other recommendations are more meaningful, and deserve some review 7 years
later.

PBP は多くの recommendations があります。
その多くは コードのフォーマットの推奨のように trivial なものです。
わたしは単一のコードベースを通じて一貫したスタイルを使っていることがとても好きです。
とはいえ、わたしはあるべきスタイルについては flexible です。
彼のそのほかの recommendations の一部はより meaningful なものであり、
7年たった今でもいくつかレビューを受けています。

I'll take a look at each recommendation in Chapter 15, Objects, and see how it's stood
up in the era of Modern Perl. Maybe I'll tackle some other chapters in the future.

(略)

Conclusion

This chapter of PBP holds up pretty well 7 years later. He was out of sync with the
community on accessor naming in 2004, and that hasn't changed. I disagree with him on
overloading, but I don't know that there's any community agreement on that.

Other chapters talk more about inside-out objects, and I think he missed the mark in
thinking those were the future of Perl OO. I'll cover that in more detail if I review
one of those chapters.

His remaining recommendations remain pretty solid in 2011.

■_

【連載】鉄道トリビア (92) 震災後、燃料不足の解消に尽力したタンク貨車 | ライフ | マイコミジャーナル
Hexenkessel, 震災後の停電はなく、水の備蓄もあったため、女川原発は津波で壊滅的被害を受けた女川町民の救いの場所になっ...
日本製紙の工場と、周辺の被災状況 - 石巻・女川の震災被害
師匠が欲しい - IT系で覚醒めたい
翻訳家・山形浩生氏が2chの能登麻美子スレにまさかの降臨世界一かわいいよ！
http://www.kotobukiya.co.jp/kotobukiyashop/img0/qbei_pos.jpg
カッコがイヤなら消せばいいじゃない - IT系で覚醒めたい
Togetter - 「真夜中のSTMトーク」
http://panathenaia.halfmoon.jp/alang/ada/langupdate2012.pdf
IRB　それはRubyistの魔法のランプ - hp12c
変数、リファレンス
pythonのzipfileモジュールがだめ文字をうまく扱ってくれないのを解決する - yattの日記
お手軽に風変わりな文字体に変換できるジェネレータ-篆書、トンパほか | 適宜覚書-Fragments
YouTube - When Computers Were Human
どこでPerl 6のために/サンプルコードのデモをして下さいか？
ホモトピー・ラムダ計算とホモトピー型理論 - 檜山正幸のキマイラ飼育記
計算科学やソフトウェア工学で次に流行るもの、それはホモトピーでしょう。
A piece of the Rosetta stone « reperiendi

2011年03月25日

■_

買った

あとがきが重いっすよ… ○|￣|＿

モーニング掲載時から改変されたエピソードがあるとか→ 【乗客専務】カレチ【車掌】

・アフタヌーン
主役が登場したのどれだけぶりだろう＞むげにん

・ダムエー
26日発売掲載ので最終回? ＞オリジン

■_ C/C++プログラミングの「迷信」と「誤解」

フラゲして買ってたんですが、いろいろ割り込みがあったりなかったり。正直重箱の隅過ぎないかというのもあったりしましたが、読んでて面白かったです。一番のお気に入り(?) は「scanfはバッファオーバーランを防げない?」かな。意地悪な入力パターンを考えると結構手間なんですよね。＞ fgets + sscanf とか

でもこの本で一番いいなと思ったのは、ところどころ挟まるこういう漫画や挿絵だったりして (高木さんごめんなさい)。

■_ Exploring Binary

以前に読んだことのあるものを含めまだざっとしか眺めてないのだけど、結構いろいろ書いてて面白そう。

■_ 便利だけどあまり知られていないデータ構造

回答編。リンクまではむり ○|￣|＿

知らないのが結構あるなあ。以下適当に抜粋

language agnostic - What are the lesser known but cool data structures ? - Stack Overflow

There are some data structures around that are really cool but are unknown to most 
programmers. Which ones are they?

Everybody knows about linked lists, binary trees, and hashes, but what about Skip 
lists and Bloom filters for example. I would like to know more data structures that 
are not so common, but are worth knowing because they rely on great ideas and enrich a 
programmer's tool box.

PS: I am also interested in techniques like Dancing links which make clever use of 
properties of a common data structure.

EDIT: Please try to include links to pages describing the data structures in more detail. Also, try to add a couple of words on why a data structures is cool (as Jonas Kölker already pointed out). Also, try to provide one data-structure per answer. This will allow the better data structures to float to the top based on their votes alone.
language-agnostic data-structures computer-science self-improvement

Rope: It's a string that allows for cheap prepends, substrings, middle insertions and 
appends. I've really only had use for it once, but no other structure would have 
sufficed. Regular strings and arrays prepends were just far too expensive for what we 
needed to do, and reversing everthing was out of the question.

Here are a few:

* Suffix tries. Useful for almost all kinds of string searching
(http://en.wikipedia.org/wiki/Suffix_trie#Functionality). See also suffix arrays;
they're not quite as fast as suffix trees, but a whole lot smaller.

* Splay trees (as mentioned above). The reason they are cool is threefold:
o They are small: you only need the left and right pointers like you do in
any binary tree (no node-color or size information needs to be stored)
o They are (comparatively) very easy to implement
o They offer optimal amortized complexity for a whole host of "measurement criteria"
(log n lookup time being the one everybody knows).
See http://en.wikipedia.org/wiki/Splay_tree#Performance_theorems

* Heap-ordered search trees: you store a bunch of (key, prio) pairs in a tree,
such that it's a search tree with respect to the keys, and heap-ordered with respect
to the priorities. One can show that such a tree has a unique shape (and it's not
always fully packed up-and-to-the-left). With random priorities, it gives you expected
O(log n) search time, IIRC.

* A niche one is adjacency lists for undirected planar graphs with O(1) neighbour
queries. This is not so much a data structure as a particular way to organize an
existing data structure. Here's how you do it: every planar graph has a node with
degree at most 6. Pick such a node, put its neighbors in its neighbor list, remove it
from the graph, and recurse until the graph is empty. When given a pair (u, v), look
for u in v's neighbor list and for v in u's neighbor list. Both have size at most 6,
so this is O(1).

By the above algorithm, if u and v are neighbors, you won't have both u in v's list and v in u's list. If you need this, just add each node's missing neighbors to that node's neighbor list, but store how much of the neighbor list you need to look through for fast lookup.

I'm surprised no one has mentioned Merkle trees (ie. Hash Trees).

Used in many cases (P2P programs, digital signatures) where you want to verify the 
hash of a whole file when you only have part of the file available to you.

    * Kd-Trees, spatial data structure used (amongst others) in Real-Time Raytracing, 
      has the downside that triangles that cross intersect the different spaces need to
      be clipped. Generally BVH's are faster because they are more lightweight.

    * MX-CIF Quadtrees, store bounding boxes instead of arbitrary point sets by    
      combining a regular quadtree with a binary tree on the edges of the quads.

    * HAMT, hierarchical hash map with access times that generally exceed O(1) 
      hash-maps due to the constants involved.

    * Inverted Index, quite well known in the search-engine circles, because it's used 
      for fast retrieval of documents associated with different search-terms.
 
Most, if not all, of these are documented on the NIST Dictionary of Algorithms and 
Data Structures

Work Stealing Queue

Lock-free data structure for dividing the work equaly among multiple threads 
http://stackoverflow.com/questions/2101789/implementation-of-a-work-stealing-queue-in-
c-c

Left Leaning Red-Black Trees. A significantly simplified implementation of red-black 
trees by Robert Sedgewick published in 2008 (~half the lines of code to implement). If 
you've ever had trouble wrapping your head around the implementation of a Red-Black 
tree, read about this variant.

Very similar (if not identical) to Andersson Trees.

Fenwick Tree. It's a data structure to keep count of the sum of all elements in a 
vector, between two given subindexes i and j. The trivial solution, precalculating the 
sum since the begining doesn't allow to update a item (you have to do O(n) work to 
keep up).

Fenwick Trees allow you to update and query in O(log n), and how it works is really 
cool and simple. It's really well explained in Fenwick's original paper, freely 
available here:

http://www.cs.ubc.ca/local/reading/proceedings/spe91-95/spe/vol24/issue3/spe884.pdf

Its father, the RQM tree is also very cool: It allows you to keep info about the 
minimum element between two indexes of the vector, and it also works in O(log n) 
update and query. I like to teach first the RQM and then the Fenwick Tree.

Pairing heaps are a type of heap data structure with relatively simple implementation 
and excellent practical amortized performance.

Ball Trees. Just because they make people giggle.

A ball tree is a data structure that indexes points in a metric space. Here's an 
article on building them. They are often used for finding nearest neighbors to a point 
or accelerating k-means.

language agnostic - What are the lesser known but cool data structures ? - Stack Overflow

Enhanced hashing algorithms are quite interesting. Linear hashing is neat, because it 
allows splitting one "bucket" in your hash table at a time, rather than 
rehashing the entire table. This is especially useful for distributed caches. However, 
with most simple splitting policies, you end up splitting all buckets in quick 
succession, and the load factor of the table oscillates pretty badly.

I think that spiral hashing is really neat too. Like linear hashing, one bucket at a 
time is split, and a little less than half of the records in the bucket are put into 
the same new bucket. It's very clean and fast. However, it can be inefficient if each 
"bucket" is hosted by a machine with similar specs. To utilize the hardware 
fully, you want a mix of less- and more-powerful machines.


Splash Tables are great. They're like a normal hash table, except they guarantee 
constant-time lookup and can handle 90% utilization without losing performance. 
They're a generalization of the Cuckoo Hash (also a great data structure). They do 
appear to be patented, but as with most pure software patents I wouldn't worry too 
much.

2-3 Finger Trees by Hinze and Paterson are a great functional data structure 
swiss-army knife with great asymptotics for a wide range of operations. While complex, 
they are much simpler than the imperative structures by Persistent lists with 
catenation via recursive slow-down by Kaplan and Tarjan that preceded them.

They work as a catenable deque with O(1) access to either end, O(log min(n,m)) append, 
and provide O(log min(n,length - n)) indexing with direct access to a monoidal prefix 
sum over any portion of the sequence.

Implementations exist in Haskell, Coq, F#, Scala, Java, C, Clojure, C# and other 
languages.

You can use them to implement priority search queues, interval maps, ropes with fast 
head access, maps, sets, catenable sequences or pretty much any structure where you 
can phrase it as collecting a monoidal result over a quickly catenable/indexable 
sequence.

I also have some slides describing their derivation and use.

Fast Compact tries:

    * Judy arrays: Very fast and memory efficient ordered sparse dynamic arrays for 
      bits, integers and strings. Judy arrays are faster and more memory efficient than
      any binary-search-tree.
    * HAT-trie: A Cache-conscious Trie-based Data Structure for Strings
    * B-tries for Disk-based String Management

Splay Trees are cool. They reorder themselves in a way that moves the most often 
queried elements closer to the root.

BK-Trees, or Burkhard-Keller Trees are a tree-based data structure which can be used 
to quickly find near-matches to a string.

I sometimes use Inversion LIsts to store ranges, and they are often used to store 
character classes in regular expressions. See for example 
http://www.ibm.com/developerworks/linux/library/l-cpinv.html

Another nice use case is for weighted random decisions. Suppose you have a list of 
symbols and associated probabilites, and you want to pick them at random according to 
these probabilities

   a => 0.1
   b => 0.5
   c => 0.4

Then you do a running sum of all the probabilities:

  (0.1, 0.6, 1.0)

This is your inversion list. You generate a random number between 0 and 1, and find 
the index of the next higher entry in the list. You can do that with a binary search, 
because it's sorted. Once you've got the index, you can look up the symbol in the 
original list.

If you have n symbols, you have O(n) preparation time, and then O(log(n)) acess time 
for each randomly chosen symbol - independently of the distribution of weights.

A variation of inversion lists uses negative numbers to indicate the endpoint of 
ranges, which makes it easy to count how many ranges overlap at a certain point. See 
http://www.perlmonks.org/index.pl?node_id=841368 for an example.

Bucket Brigade

They are used extensively in Apache. Basically they are a linked list that loops 
around on itself in a ring. I am not sure if they are used outside of Apache and 
Apache modules but they fit the bill as a cool yet lesser known data structure. A 
bucket is a container for some arbitrary data and a bucket brigade is a collection of 
buckets. The idea is that you want to be able to modify and insert data at any point 
in the structure.

Lets say that you have a bucket brigade that contains an html document with one 
character per bucket. You want to convert all the < and > symbols into &lt; 
and &gt; entities. The bucket brigade allows you to insert some extra buckets in 
the brigade when you come across a < or > symbol in order to fit the extra 
characters required for the entity. Because the bucket brigade is in a ring you can 
insert backwards or forwards. This is much easier to do (in C) than using a simple 
buffer.

Some reference on bucket brigades below:

Apache Bucket Brigade Reference

Introduction to Buckets and Brigades

Zippers

A zipper is a concept that can be used with trees, or lists, and is extremely useful 
in functional programming. The idea is to store a linked data structure with a 
"context," and then use the context to move up and down the linked data 
structure when necessary.

See http://www.haskell.org/haskellwiki/Zipper for a much better description.


Fenwick trees (or binary indexed trees) are a worthy addition to ones toolkit. If you 
have an array of counters and you need to constantly update them while querying for 
cumulative counts (as in PPM compression), Fenwick trees will do all operations in 
O(log n) time and require no extra space. See also this topcoder tutorial for a good 
introduction.


I really really love Interval Trees. They allow you to take a bunch of intervals (ie 
start/end times, or whatever) and query for which intervals contain a given time, or 
which intervals were "active" during a given period. Querying can be done in 
O(log n) and pre-processing is O(n log n).

language agnostic - What are the lesser known but cool data structures ? - Stack Overflow


PQ-Trees

Someone else already proposed Burkhard-Keller-Trees, but I thought I might mention 
them again in order to plug my own implementation. :)

http://well-adjusted.de/mspace.py/index.html

There are faster implementations around (see ActiveState's Python recipes or 
implementations in other languages), but I think/hope my code helps to understand 
these data structures.

By the way, both BK and VP trees can be used for much more than searching for similar 
strings. You can do similarity searches for arbitrary objects as long as you have a 
distance function that satisfies a few conditions (positivity, symmetry, triangle 
inequality).

Arne Andersson trees are a simpler alternative to red-black trees, in which only right 
links can be red. This greatly simplifies maintenance, while keeping performance on 
par with red-black trees. The original paper gives a nice and short implementation for 
insertion and deletion.


I think the FM-index by Paolo Ferragina and Giovanni Manzini is really cool. 
Especially in bioinformatics. It's essentially a compressed full text index that 
utilizes a combination of a suffix array and a burrows-wheeler transform of the 
reference text. The index can be searched without decompressing the whole index.

Ternary Search Tree

    * Quick prefix search (for incremental autocomplete,etc)
    * Partial Matching (When you want to find all words within X hamming distance of a string)
    * Wildcard Searches

Quite Easy to implement.

■_

Understanding the Python GIL (#82) - PyCon 2010 Atlanta - A Conference for the Python Community

■_ エクササイズ重要

以前、二分探索を書かせてもきちんとそれを書けるプログラマーというのは少ないという話題を紹介しましたが、その続き。

Common bugs and why exercises matter (binary search part 2) | The Reinvigorated Programmer

Common bugs and why exercises matter (binary search part 2)

Posted on April 21, 2010 by Mike Taylor| 62 Comments

Many, many thanks to all of you who have contributed to the astonishing response to 
the previous article on the difficulty of getting binary search right.  I've never 
seen anything like this: in just over 24 hours since I posted the article, 541 
comments have been posted, and they're still rolling in.  (That number is a bit 
inflated by people re-posting the same code-samples multiple times after seeing 
WordPress mangle their source, but even taking that into account it still amazes me.)

You guys rule.  Seriously.

I have a lot to say in response to it all.  Read on …

(さくっと略)

Anyway, even allowing the various biasing factors, it seems possible — maybe even 
likely — that our hit rate is better than 10%.

Common bugs
(よく見受けられたバグ)

It's been interesting to see that many of the programs that had bugs were going wrong 
in the same ways.  Common bugs included:

    * Not dealing correctly with a zero-element array
      要素がゼロ個の配列を正しく扱っていない

    * Not dealing correctly with a one-element array
      要素が一個の配列を正しく扱っていない

    * Not finding a value in the last element of the array
      配列の末尾の要素にある値を見つけられない

    * Not dealing with failed searches (the sought element is not there)
      検索に失敗した(検索対象が存在していない)場合に対処していない

    * Not coping with repeated elements in the array
      配列中の要素に繰り返しがあったときの対処がない

    * Not omitting the midpoint from the subranges when narrowing the range after a 
      probe.  (This  results in an infinite loop for some inputs.)
      probe のあとで範囲を狭めるときに部分範囲から midpoint を除去していない
      (一部の入力に対して無限ループになってしまう可能性がある)。

    * Pointing to the first and last members of the range but coding as though the 
l     ast-pointer variable pointed to the element after the range (or vice versa).
      対象となる範囲の先頭や末尾を point するが、その先が範囲から外れたものを
      指しているコーディングをしている

(And, yes, not dealing with a zero-element array is a bug, for those commenters who 
asked.  Zero is a perfectly good number of elements for an array to have, and an array 
search that fails for empty arrays is no more acceptable than an addition operator 
that fails if you add zero.  A few commenters also asked whether it's OK to ignore 
the case when a null Array reference is passed in: yes, of course — that would violate 
the precondition that the search is in a pre-sorted array, so all bets are off.  No 
array at all is completely different from an empty array.)

The interesting thing about these common failure modes is that they are all so predictable.
If, before starting to code, we'd each taken two minutes to write down the ways in which
we could imagine our programs failing, I bet we'd all have listed most if not all of
these.  Yet even knowing that, we find them hard to avoid.

In light of the list of common bugs, we can easily see what kinds of test cases we 
need to include in the suite: empty arrays, single-element arrays, arrays where the 
sought element is absent, or appears only at the end, or only at the start, or 
multiple times; and of course a healthy does of randomly generated sorted arrays.

Looking at the list of bugs, I find myself thinking of the rules in Kernighan and Pike'
s Elements of Programming Style.  Back when I reviewed that book, and listed the rules 
that are gathered at the end of the article, some commenters felt that they were so 
obvious, so trivial, as to be worthless or even insulting.  And yet most of the 
mistakes listed above are covered in K&P's list.  A few of the relevant rules are:

    * “Write clearly — don't be too clever.”
       明確わかりやすくに書こう。小賢しくならず。

    * “Make sure your code “does nothing” gracefully.”

    * “Watch out for off-by-one errors.”
       一個外れのエラーには注意しよう。

    * “Take care to branch the right way on equality.”
       正しいやり方で等価性を判定して注意深く分岐しよう

    * “Make sure special cases are truly special.”
       特殊なケースは本当に特殊なものにしよう。

    * “Choose a data representation that makes your program simple.”
       プログラムをシンプルにするデータの表現を選択しよう

#書法ひいたほうが良かったかな

Then there is the rule that the I didn't let you follow literally, but the spirit of 
which should pervade our design process: “Test programs at their boundary values”.  
And the very general but very relevant rule “Don't stop with your first draft.”  
And of course I must mention my favourite:

    “Say what you mean, simply and directly.”

Why does this exercise matter?
以下略

このプログラム書法、オススメではあるのだけどいかんせん年代ものなので、 ↓のほうが読むには良いかも

■_

2011年03月24日

■_

時間が足りない感が

「構造体渡し」って聞かないよなあ "構造体渡し" - Google 検索

■_

Why “Volatile” Fixes the 2.2250738585072011e-308 Bug - Exploring Binary

Why “Volatile” Fixes the 2.2250738585072011e-308 Bug
(なぜ“Volatile”で 2.2250738585072011e-308 Bug を修正できたのか)

By Rick Regan (Published January 26th, 2011)

Copyright c 2008-2011 Exploring Binary

http://www.exploringbinary.com/why-volatile-fixes-the-2-2250738585072011e-308-bug/

Recently I discovered a serious bug in x87 builds of PHP: PHP's decimal to 
floating-point conversion routine, zend_strtod(), went into an infinite loop when 
converting the decimal string 2.2250738585072011e-308 to double-precision binary 
floating-point. This problem was fixed with a simple one line of code change to 
zend_strtod.c:

Recently I discovered a serious bug in x87 builds of PHP: 
decimal から floating-point へ変換するPHP の ルーチン zend_strtod() は
decimal 文字列 2.2250738585072011e-308 を倍精度二進浮動小数点数に
変換するときに無限ループに落ち込んでいました。
この問題は zend_strtodd.c に対する単純な一行の変更によって解決されました:

This line
この行が

double aadj, aadj1, adj;

was changed to
このように変更されました

volatile double aadj, aadj1, adj;

Why does this fix the problem? I uncovered the very specific reason: it prevents a 
double rounding on underflow error.

なぜこれが問題を解決したのでしょうか?
I uncovered the very specific reason: 
it prevents a double rounding on underflow error.


The Failing Calculation in zend_strtod()
zend_strtod() における Failing Calculation

For input “2.2250738585072011e-308”, zend_strtod() is supposed to return 0x0.fffffffffffffp-1022, 
or in normalized binary scientific notation,

“2.2250738585072011e-308”という入力に対し、 zend_strtod() は
 0x0.fffffffffffffp-1022 もしくは正規化された二進科学表記 (normalized binary scientific notation)
 で

1.111111111111111111111111111111111111111111111111111 x 2-1023.

を返します。

This is a subnormal number, 2-1074 below DBL_MIN, which is 2-1022.
これは非正規化数(subnormal number) で、


Instead, zend_strtod() goes into an infinite loop. This section of code, appearing inside a loop, is responsible for the problem:

ところが zend_strtod() は無限ループへと落ち込みます。
ループの内側にあるコードのこのセクションは rsponsible for the problem のように見えます。

/* Compute adj so that the IEEE rounding rules will
 * correctly round rv + adj in some half-way cases.
 * If rv * ulp(rv) is denormalized (i.e.,
 * y <= (P-1)*Exp_msk1), we must adjust aadj to avoid
 * trouble from bits lost to denormalization;
 * example: 1.2e-307 .
 */
if (y <= (P-1)*Exp_msk1 && aadj >；= 1.) {
	aadj1 = (double)(int)(aadj + 0.5);
	if (!dsign)
		aadj1 = -aadj1;
}
adj = aadj1 * ulp(value(rv));
value(rv) += adj;

The two lines I've highlighted adjust the double-precision converted result, value(rv), 
so that it converges to the correctly rounded result. Although the source code 
expression of the calculation is correct, it fails when implemented in 
extended-precision arithmetic ? which is what happens when the variable adj is not 
declared volatile.

zend_strtod() Executes In Double-Precision Mode
zend_strtod() は倍精度モードで実行されている

zend_strtod() executes in double-precision mode, so this is not your run-of-the-mill 
double-precision/extended-precision mismatch. So how does extended-precision come into 
play? Extended-precision numbers in the double-precision subnormal exponent range 
(-1023 through -1074) have 53 bits of precision, not the usual 1 to 52 bits. As I'll 
show, this leads to a special case of double rounding error ? a double rounding on 
underflow error. The result is that value(rv) remains unchanged ? and zend_strtod() 
goes into an infinite loop!

Volatile Almost Forces Double-Precision Semantics
volatile は double-precision semantics をほぼ強制する

Floating-point calculations done with x87 FPU instructions are done in extended-precision 
registers, even when the processor is in double-precision mode and the variables being 
operated on are double-precision. This means that true double-precision semantics can't be 
emulated. Why? Because extended-precision values have a wider exponent range. This can almost 
be fixed by declaring double variables ‘volatile'.

x87 FPU命令を使った浮動小数点数演算は、プロセッサーが倍精度モードで変数も倍精度のもの
であったとしても拡張精度を持ったレジスターで行われます。これは本当の倍精度実数の
セマンティクスがエミュレートできないことを意味します。なぜでしょうか?
なぜなら拡張精度の値はより広い範囲の指数部を持っているからです。
これは、double の変数を 'volatie' として宣言してやることでほとんど解決されます。

The ‘volatile' keyword forces the compiler to store a variable in memory each time it's 
written, and to load it from memory each time it's read. This ensures that the exponent is 
constrained to the double-precision range; a number that is too large for double-precision will 
overflow, and a number that is too small will underflow. So far, so good.

'volatile'キーワードは、コンパイラーに対して変数に書き込みするたびにメモリーに格納
するように強制し、読み出しのたびにメモリーからロードするように強制するものです。
これは、指数部が倍精度の範囲に収まることを保証します。倍精度実数には大きすぎる
数値はオーバーフローを起こし、小さすぎる数値はアンダーフローを起こします。

But there remains one problem, in the handling of underflow: subnormal numbers will be doubly 
rounded. In double-precision mode, extended-precision results are rounded to 53 bits. If the 
result is subnormal relative to double-precision, it will be rounded again when written to 
memory ― to between 1 and 52 significant bits, the range of precision for a subnormal 
double-precision number.

しかし問題がまだ一つ残っています。それはアンダーフローの取り扱いです:非正規化数は二重
丸めされてしまいます。倍精度モードでは、拡張精度の結果は53ビットに丸められます。ここで
もしその丸めの結果が倍精度において subnornal reative であった場合、それをメモリーに書
き戻す際に再度丸めが行われて非正規化倍精度数の range of precision を1ビットから52ビッ
トのsignificant bits に収まるように


That said, the ‘volatile' keyword does fix the PHP problem at hand. It doesn't eliminate 
double rounding, but it does eliminate the specific double rounding error that causes the 
problem. (Does this mean there are other double rounding errors lurking? If so, hopefully the 
result is just an incorrect conversion, and not another infinite loop.)

'volatile' キーワードは PHP の問題を at hand で解決するのですが、二重丸めを
ekiminate することはありません。ただし、問題を引き起こすような特定の倍精度丸めのエラーを
eliminate します。


The Generated Code Without ‘Volatile'
'Volatile'なしで生成されたコード

I took a fixed version of PHP ― PHP 5.3.5 ― and removed the ‘volatile' keyword to restore 
it to its pre-fixed state. I built it on an Ubuntu Linux system and traced it using the gdb 
debugger. Without the ‘volatile' keyword, the two highlighted lines of source code compile 
to these instructions:

0x8302a4a  call 0x8301290 <ulp>
           ...
           0x83012b9  fldl -0x10(%ebp)    ; load ulp(value(rv))
           0x83012bc  leave
           0x83012bd  ret
           ...
0x8302a4f  mov -0x8c(%ebp),%eax
0x8302a55  fldl -0x30(%ebp)         ; load value(rv)
0x8302a58  fldl -0x98(%ebp)         ; load aadj1
0x8302a5e  fmulp %st,%st(2)         ; adj = aadj1 * ulp(value(rv))
0x8302a60  faddp %st,%st(1)         ; value(rv) += adj
0x8302a62  fstpl -0x30(%ebp)        ; store value(rv)

The problem occurs because the result of the multiplication, adj, is left on the FPU register 
stack, where it is then added to value(rv). This leads to a double rounding on underflow 
error, which I'll explain below.

乗算の結果 adj がFPU レジスタースタック上に置かれたままで、続いて value(rv) と
加算を行うために問題は発生します。これはアンダーフローエラー時の二重丸め
(double rounding on underflow error) に繋がります。

The result of the fmulp instruction (adj)

Before the fmulp instruction executes, the value of its two operands ― aadj1 and 
ulp(value(rv)) ―  are on the stack (not shown). This is the FPU stack after the fmulp 
instruction at 0x8302a5e executes (I formatted gdb's output for display):

fmulp 命令の実行前、その二つのオペランド aadj1 と ulp(value(rv)) はスタック上にあります。
以下に示すのは 0x8302a5e で fmulp 命令を実行したあとのFPUスタックです
(gdb の画面出力を整形しました): 

st0  2.2250738585072013830902327173324041e-308  (raw 0x3c018000000000000000)
st1  -2.8309023271733244206437071197035129e-324 (raw 0xbbcc92aee22acdf2d000)
st2  0	                                        (raw 0x00000000000000000000)
st3  0	                                        (raw 0x00000000000000000000)
st4  0                                          (raw 0x00000000000000000000)
st5  0	                                        (raw 0x00000000000000000000)
st6  0	                                        (raw 0x00000000000000000000)
st7  -0.57298100991279032889735844946699217	(raw 0xbffe92aee22acdf2d000)

The result of the multiplication, adj, is in st1: 0xbbcc92aee22acdf2d000. That hexadecimal 
value represents an extended-precision floating-point number, which has three fields: a 1 bit
sign field, a 15 bit excess-16383 exponent field, and a 64 bit fraction field (there is no 
implicit leading 1 bit in extended-precision). Translated into binary and separated into 
fields, 0xbbcc92aee22acdf2d000 is

乗算の結果である adj は st1 にありその内容は 0xbbcc92aee22acdf2d000 です。
この十六進値は拡張浮動小数点数を表現していて、三つのフィールドに分かれます。

二進で表し、またフィールドごとに分割すると、0xbbcc92aee22acdf2d000 は次のようになります

1
011101111001100
1001001010101110111000100010101011001101111100101101000000000000

The exponent field 011101111001100 is 15308 in decimal, and after subtracting 16383, 
represents an exponent of -1075. So the number this floating-point value represents is

指数部の 011101111001100 は十進表記すると 15308 で、

-1.001001010101110111000100010101011001101111100101101 x 2-1075.

The Result of the faddp Instruction (value(rv))

Before the faddp instruction executes, the stack contains its two operands: the initial value 
of value(rv) in st0 (loaded with fldl instruction at address 0x8302a55) and adj in 
st1.\^ value(rv) is 0x3c018000000000000000:

faddp の実行前、スタックには二つのオペランドがあります:
st0 にある value(rv) の初期値と
st1 にある adj です


0
011110000000001
1000000000000000000000000000000000000000000000000000000000000000.

This represents 2-1022.

This is the FPU stack after the faddp instruction at 0x8302a60 executes:
faddp 命令が 0x8302a60 で実行されたあとのFPUスタックはこうなります:

st0  2.225073858507201136057409796709132e-308   (raw 0x3c00fffffffffffff800)
st1  0	                                        (raw 0x00000000000000000000)
st2  0	                                        (raw 0x00000000000000000000)
st3  0	                                        (raw 0x00000000000000000000)
st4  0                                          (raw 0x00000000000000000000)
st5  0	                                        (raw 0x00000000000000000000)
st6  -0.57298100991279032889735844946699217	(raw 0xbffe92aee22acdf2d000)
st7  2.2250738585072013830902327173324041e-308	(raw 0x3c018000000000000000)

The result ― the new value of value(rv) ― is in st0; it is 0x3c00fffffffffffff800:
計算の結果、つまり value(rv) の値は st0 にあり、その内容は 0x3c00fffffffffffff800 です。


0
011110000000000
1111111111111111111111111111111111111111111111111111100000000000

This represents the 53 significant bit value

1.1111111111111111111111111111111111111111111111111111 x 2-1023.

The Result of the fstpl Instruction (value(rv))

The fstpl at address 0x8302a62 stores the copy of value(rv) in st0 back to memory, which 
converts it to double-precision. Since the value of st0 is a subnormal number in 
double-precision, it must be rounded to 52 bits. This rounds value(rv) to 2-1022 (which 
ironically makes it normal). In other words, value(rv) remains unchanged!

0x8302a62番地にある fstpl はst0に置かれている value(rv) のコピーを倍精度に変換してメモ
リーに書き戻します。st0の値は倍精度の非正規化数 (subnormal number)なので、52ビットに丸
める必要があります。この丸めは value(rv) を 2-1022 にします
(which ironically makes it normal)。
言い換えると、value(rv) は変更されないままなのです!

What Went Wrong: A Double Rounding on Underflow Error
何が間違ったのか: アンダーフローエラーに対する二段階丸め

You have to look at the calculation of value(rv) + adj more closely to see what happened.

なにが起こったのかを理解するためには、より詳しく value(rv) + adj の計算について
見ていく必要があります。

value(rv) = 2-1022
adj = -1.001001010101110111000100010101011001101111100101101 x 2-1075

If you add them by hand, using binary addition, you get this 104-bit result (I'd have 
shown you my calculation, but alignment of the exponents made the lines too long):

これをbinary addition を使って手計算で加算した場合、その結果は
104ビットの長さを持つものになります。

1.1111111111111111111111111111111111111111111111111110110110101010001
000111011101010100110010000011010011 x 2-1023

The highlighted bit is significant bit 54; since the value of bit 54 and beyond is 
greater than 1/2 ULP, this number, to 53 bits, rounds up to

1.1111111111111111111111111111111111111111111111111111 x 2-1023

As I said above, this is a subnormal number in double-precision, so it must now be 
rounded to 52 bits. The highlighted bit ? bit 53 ? is 1, and all bits beyond are zero. 
This is a halfway case, so the tie is broken with the round-half-to-even rule. In this 
case, that means rounding up to 2-1022.

If the result were rounded only once, directly to 52 bits, it would have been correct:

1.111111111111111111111111111111111111111111111111111 x 2-1023

In other words, this is a double rounding error. It essentially undoes the subtraction. 
The second rounding adds back 2-1074, which is the amount we were really trying to 
subtract in the first place!

The Generated Code With ‘Volatile'

After restoring the ‘volatile' keyword to the source code, I rebuilt PHP and traced 
it again. With the ‘volatile' keyword, the two highlighted lines of source code 
compile to these instructions:

0x8302ba4  call 0x8301290 <ulp>；
           ...
           0x83012b9  fldl -0x10(%ebp)    ; load ulp(value(rv))
           0x83012bc  leave
           0x83012bd  ret
           ...
0x8302ba9  mov -0xac(%ebp),%eax
0x8302baf  fldl -0x38(%ebp)         ; load aadj1
0x8302bb2  fmulp %st,%st(1)         ; adj = aadj1 * ulp(value(rv))
0x8302bb4  fstpl -0x40(%ebp)        ; store adj
0x8302bb7  fldl -0x48(%ebp)         ; load value(rv)
0x8302bba  fldl -0x40(%ebp)         ; load adj
0x8302bbd  faddp %st,%st(1)         ; value(rv) += adj
0x8302bbf  fstpl -0x48(%ebp)        ; store value(rv)

In this code, the fmulp and faddp instructions are separated by a load/store sequence, 
forcing adj to double-precision before adding it to value(rv). This sidesteps the 
double rounding issue.

The Result of the fmulp Instruction (adj)

This is the FPU stack after the fmulp instruction at 0x8302bb2 executes:

st0  -2.8309023271733244206437071197035129e-324 (raw 0xbbcc92aee22acdf2d000)
st1  0                                          (raw 0x00000000000000000000)
st2  0	                                        (raw 0x00000000000000000000)
st3  0	                                        (raw 0x00000000000000000000)
st4  0                                          (raw 0x00000000000000000000)
st5  0	                                        (raw 0x00000000000000000000)
st6  2.2250738585072013830902327173324041e-308  (raw 0x3c018000000000000000)
st7  -0.57298100991279032889735844946699217	(raw 0xbffe92aee22acdf2d000)

The result of the multiplication, adj, is the same as in the unfixed code, except it 
lives in st0:

-1.001001010101110111000100010101011001101111100101101 x 2-1075.

The Result of the fstpl Instruction (value(rv))

The fstpl at address 0x8302bb4 stores the copy of adj in st0 back to memory, which converts 
it to double-precision. Since the value of st0 is a subnormal number in double-precision, 
it must be rounded to one bit, since that's all that remains of the 52 maximum bits of 
subnormal precision at exponent 2-1074.

To see where the rounding occurs, rewrite st0 as

丸めが行われている場所を確認するために st0 を以下のように書き換えます

-0.1001001010101110111000100010101011001101111100101101 x 2-1074

This rounds up to -2-1074, the smallest value double-precision can represent.
The Result of the fldl Instruction (adj)

This is the FPU stack after the fldl instruction at 0x8302bba executes:

st0  -4.9406564584124654417656879286822137e-324 (raw 0xbbcd8000000000000000)
st1  2.2250738585072013830902327173324041e-308  (raw 0x3c018000000000000000)
st2  0	                                        (raw 0x00000000000000000000)
st3  0	                                        (raw 0x00000000000000000000)
st4  0                                          (raw 0x00000000000000000000)
st5  0	                                        (raw 0x00000000000000000000)
st6  0	                                        (raw 0x00000000000000000000)
st7  2.2250738585072013830902327173324041e-308	(raw 0x3c018000000000000000)

This confirms that adj, which is now in st0 as 0xbbcd8000000000000000, contains -2-1074:

1
011101111001101
1000000000000000000000000000000000000000000000000000000000000000

The Result of the faddp Instruction (value(rv))

This is the FPU stack after the faddp instruction at 0x8302bbd executes:
0x8302bbd で faddp 命令を実行した後のFPUスタックはこうなります: 

st0  2.2250738585072008890245868760858599e-308  (raw 0x3c00fffffffffffff000)
st1  0                                          (raw 0x00000000000000000000)
st2  0	                                        (raw 0x00000000000000000000)
st3  0	                                        (raw 0x00000000000000000000)
st4  0                                          (raw 0x00000000000000000000)
st5  0	                                        (raw 0x00000000000000000000)
st6  2.2250738585072013830902327173324041e-308  (raw 0x3c018000000000000000)
st7  -4.9406564584124654417656879286822137e-324	(raw 0xbbcd8000000000000000)

The addition works this time, changing value(rv), making it 0x3c00fffffffffffff000:

0
011110000000000
1111111111111111111111111111111111111111111111111111000000000000

This is 1.111111111111111111111111111111111111111111111111111 x 2-1023, the correctly 
rounded result. (It is 52 bits, so when it's stored back to memory with the subsequent 
fstpl instruction, no rounding occurs.)

So the loop now terminates, after only one adjustment to value(rv).

Summary
まとめ

PHP got stuck converting a decimal number whose double-precision binary floating-point 
representation is DBL_MIN ― 2-1074, a subnormal number at the normalized/unnormalized number 
boundary. Its conversion routine started with an approximation of DBL_MIN and tried to adjust
it by subtracting a number a tad smaller than 2-1074 ― a number too small for a double to 
represent. The addition of the ‘volatile' keyword forced the adjustment to be rounded to 
2-1074, which allowed the subtraction to work as intended ― and avoid a double rounding on 
underflow error.

PHP は、正規化数と非正規化数の境界にある非正規化数であるDBL_MIN - 2^-1074 を表現するような
倍精度の二進浮動小数点数の十進数値への変換で stuck してしまいます。
PHPの変換ルーチンはBDL_MINの近似値から始めて、2^-1074 より小さな tad を引くことで
調整を試みますが、その数値は倍精度で表現するには小さすぎるものなのです。
'volatile' キーワードを追加することで adjustment を強制的に 2^-1074 にします。
この値は意図したように減算が働くようにするもので、アンダーフローにおける
二重丸めを排除するものです。

A Good Quote

I found this quote regarding double rounding on underflow, in an appendix of “What Every 
Computer Scientist Should Know About Floating-Point Arithmetic” titled “Differences Among 
IEEE 754 Implementations” (see subsection “Programming Language Support for Extended 
Precision”, bullet 4):

    “Of course, this form of double-rounding is highly unlikely to affect any practical 
     program adversely.”

:)

Copyright c 2008-2011 Exploring Binary

■_ 短絡評価

これを見れば、最初から今のような動作であろうことは予想できますね。

Chistory
Neonatal C

Rapid changes continued after the language had been named, for example the 
introduction of the && and || operators. In BCPL and B, the evaluation of 
expressions depends on context: within if and other conditional statements that 
compare an expression's value with zero, these languages place a special 
interpretation on the and (&) and or (|) operators. In ordinary contexts, they 
operate bitwise, but in the B statement

    if (e1 & e2) ...

the compiler must evaluate e1 and if it is non-zero, evaluate e2, and if it too is 
non-zero, elaborate the statement dependent on the if. The requirement descends 
recursively on & and | operators within e1 and e2. The short-circuit semantics of 
the Boolean operators in such `truth-value' context seemed desirable, but the 
overloading of the operators was difficult to explain and use. At the suggestion of 
Alan Snyder, I introduced the && and || operators to make the mechanism more 
explicit.

■_ データ構造

あまり知られていないけどイカしたデータ構造は?

language agnostic - What are the lesser known but cool data structures ? - Stack Overflow

There a some data structures around that are really cool but are unknown to most 
programmers. Which are they?

Everybody knows linked lists, binary trees, and hashes, but what about Skip lists, 
Bloom filters for example. I would like to know more data structures that are not so 
common, but are worth knowing because they rely on great ideas and enrich a 
programmer's tool box.

リンクつきリストや、二分木、ハッシュといったものは誰でも知っています。一方で、
たとえばスキップリストやブルームフィルターといったものはそうではありません。
わたしはそういった、一般的ではないけれども知っておくべきデータ構造を
もっと知りたいのです。それは、そのようなデータ構造が great ideas に基づいたものであり
プログラマーの道具箱を enrich するものだからです。

PS: I am also interested on techniques like Dancing links which make interesting use 
of the properties of a common data structure.

EDIT: Please try to include links to pages describing the data structures in more 
detail. Also, try to add a couple of words on why a data structures is cool (as Jonas 
Kölker already pointed out). Also, try to provide one data-structure per answer. This 
will allow the better data structures to float to the top based on their votes alone.

結構盛り上がってますが回答編は略。

■_

2011年03月23日

■_

・ちょっと大きめに表紙絵
買った。まだ読んでない。この表紙絵の場面がどういう局面なのかわからないけど、惹かれるものがあるなあ。一巻では感じなかったけど。なんで?

■_ not 書評

俺のコードのどこが悪い?―コードレビューを攻略する40のルール
藤原克則

まとめてから技術系執筆情報に気づいただよ○|￣|＿

まず、着眼点はいいと思うんですよ。この本の帯に「40のルール」ってのが書かれているんですが、それは

機能充足性
01 if/else 文での判定条件は適切ですか?
02 データ型は適切ですか?
03 ループの終了判定は妥当ですか?
04 switch文のcaseラベル列挙は十分ですか?
05 引数チェックは十分ですか?
06 戻り値チェックは十分ですか?
リソース消費
07 引数チェックが多過ぎませんか?
08 戻り値チェックが多過ぎませんか?
09 関数の局所変数が多過ぎませんか?
10 各行の負荷を把握していますか?
11 似たような処理があちこちにありませんか?
12 不要な条件判定をしていませんか?
13 構造体渡しは必要ですか?
14 不要な関数呼び出しをしていませんか?
15 関数呼び出しのコストを把握していますか?
16 例外に頼り過ぎていませんか?
17 スレッド間同期の方法は適切ですか?
18 アルゴリズムやデータの選択は適切ですか?
19 コンパイラの最適化に期待し過ぎていませんか?
20 ファイルアクセスは妥当ですか?
実行安全性
21 入力データは検証済みですか?
22 スタック消費量は妥当ですか?
23 ヒープ消費量は妥当ですか?
24 スレッド数は十分ですか?
25 ファイル操作は妥当ですか?
26 ネットワークアクセスは妥当ですか?
27 異常が検出された際の挙動は適切ですか?
保守性
28 横に長いコードになっていませんか?
29 空行を適切に挿入していますか?
30 プロジェクトのコーディング規約に従っていますか?
31 自分のコーディングスタイルは一貫していますか?
32 機能を詰め込み過ぎていませんか?
33 変更をどこまで想定していますか?
34 名前付けは適切ですか?
拡張性
35 無駄な拡張性にこだわっていませんか?
36 十分な情報を受け渡していますか?
テスト容易性
37 環境に依存する個所は把握していますあ?
38 テスト環境の高地は簡単ですか?
39 日付と時刻の扱いを考えていますか?
40 どこまで細分化したテストを行えますか?

本書ではこれをもっと掘り下げて書いていっています。が、ところどころ挟まっている「コラム」で致命的ともいえる間違い(勘違い)が。訂正のページもあるみたいですけど(さっきのリンク)、それでいいのかなあという気がしないでも (回収しろとはいわんけど)。

具体的にはこう。まず42ページ。

条件判定の評価順序
C/C++の言語仕様では、次のような条件判定が記述された場合、変数 i/j/k それぞれに対する
判定の順序を、「コンパイラの実装依存」としています。

01.5
   if((i == 1) || (j == 2) || (k == 3)) {...}

「コンパイラの実装依存」というのは、「コンパイラによって処理順序が異なるコードが生成される」
可能性があるということを意味します。
「処理順序が異なるコード」の例として、次のようなバリエーションが考えられます。

   ・i⇒j⇒kで判定(記述順)
   ・k⇒j⇒iで判定(逆順)
   ・生成コードの状態に応じて順序を変更

次のような条件判定の場合を考えて見ましょう(ママ)。
01.6
 条件判定が記述順で実施される分には、このプログラムは期待通りに機能します。
 しかし、記述と逆順に判定される場合には、変数p経由でのf1参照が先に処理されますので、
変数pがNULLだった際には、このプログラムは異常終了してしまいます。
 これらのことから、厳密に言うなら、依存関係を持つ条件判定を、1つの式として記述しては
いけません。
 もっとも、このような評価順序に依存した既存のプログラムが多数あることから、一般的な
用途で使用されるコンパイラは、記述順序通りに処理するコードを生成します。
 以前であれば、「組み込み機器向けには様々なコンパイラ実装がある」という理由が説得力を
持っていましたが、今時は組み込み機器向けの開発であっても、GCCベースのコンパイラを使用する
ケースが増えていますので、この理由も徐々に説得力を失っています。
 しかし、評価順序に依存した条件判定は、言語仕様上では「たまたま動作するに過ぎない」もの
であることは、意識しておく必要があります。

先のリンクで訂正がなされてますが、いくらなんでもこれはといいたくなります。規格から引くと

5.1.23 プログラムの実行
この規格における意味規則の規定では、最適化の問題を無視して抽象計算機の動作として記述する。
ボラタイルオブジェクトへのアクセス、オブジェクトの変更、ファイルの変更、又はこれらのいずれか
の操作を行う関数の呼び出しは、すべて副作用(side effect) と呼び、実行環境の状態に変化を生じる。
式の評価は、副作用を起こしてもよい。副作用完了点(sequence point)と呼ばれる実行環境における
特定の点において、それ以前の評価にともなう副作用は、すべて完了していなければならず、
それ以降の評価に伴う副作用が既に発生していてはならない。(副作用完了点の要約を附属書Cに示す)

6.5.13 論理AND演算子
意味規則 &&演算子の結果の型は、両オペランドの値が0と比較してともに等しくない場合は1、
それ以外の場合は0とする。結果の型は、intとする。
 ビット単位の2項&と異なり、&&演算子は左から右への評価を保証する。第1オペランドの評価の直後を
副作用完了点とする。第1オペランドの値が0と比較して等しい場合、第2オペランドは評価しない。

6.5.14 論理OR演算子
意味規則 || 演算子の結果の型は、両オペランドの値が0と比較していずれか一方でも等しくない場合は1、
それ以外の場合は0とする。結果の型は、intとする。
 ビット単位の2項|と異なり、||演算子は左から右への評価を保証する。第1オペランドの評価の直後を
副作用完了点とする。第1オペランドの値が0と比較して等しくない場合、第2オペランドは評価しない。

このあたりですか。

で、情報ページには

技術系執筆情報

1990 年代初頭に C 言語を習い憶えた当時の 「演算の評価順序は未定義」という記憶から、 論
理演算における評価順序もてっきりコンパイラ依存だと思い込んでいたのですが、 論理演算(+ 
幾つかのケース )における評価順序は、 言語仕様において保証されています。

C99 以前の仕様に関しては、 正式な文書を確認できなかったのですが、 歴史的経緯に関する注
記等が特に無いことから、 ANSI 等で標準化された時点からの仕様のように思われます。

自分が最初に学んだときは～という空気を感じないでもないですがそれはさておき、 ANSI どころかもっと前からそうであるという状況証拠があります。カーニハン大先生の「なぜ私はPascalが嫌いか」から。


3.制御の流れ

 Pascal では制御の流れの欠陥は小さいが、数が多いのです。急所を一突きされて死ぬというよりも、
何千もの切り傷のために死ぬという感じです。
 論理演算子 and と or で評価の順序がはっきりと決まっていません。C ならば&&でも||でも
明確に決まっています。この欠点は、他の多くの言語においてもそうです。ループの制御で最も
しばしば問題になります。たとえば、

     while(i<=XMAX) and (x[i]>0)do ...

というのは、Pascal の非常に悪い使い方になります。なぜなら、iのテストがx[i]のテストより
先に行われるという保証がないからです。

コンパイラーの実装依存であるなんてことならこうは書けないですよね。

■_

もう一個。こっちもすでにほかの人が指摘していますが

p59
column enumを持たない言語

Javaの場合、言語仕様としてenumに相当する定数定義機能を持ちません。
また、caseラベル値に変数を使用することができません。
しかし、変数定義をfinal化することで、その変数は固定値とみなされますので、case
ラベル値として使用することができるようになります。

04.4
public static final int SLEEPING = 0;

public static int INTPROGRESS = 1;

....

    switch(status){
      /* final 化されている変数なので使用可能 */
      case SLEEPING:
        .... 0 向け処理 ....; break;
      /* final 化されていない変数なのでコンパイルエラー */
      case INTPROGRESS:
        .... 1向け処理 ....; break;

えーとまあそのなんだ。

■_

■_ あとでよむ

これとか。

The pursuit of excellence in programming

As I write a series of thoughts on the pursuit of excellence in programming, I must 
preface my essay by asking you to ignore that I wrote these words. I invite you to 
evaluate the opinions and ideas presented here not ad hominem, but rather on the basis 
of their own merits. It would be easy to otherwise mistakenly dismiss them with the 
infamous question posed by Steve Jobs to a blogger: “What have you done that's so 
great?”.

以下略

■_

これとか。しかし Zed、なんつードメイン名を。

Programming, Motherfucker

We are a community of motherfucking programmers who have been humiliated by software 
development methodologies for years.

We are tired of XP, Scrum, Kanban, Waterfall, Software Craftsmanship (aka XP-Lite) and 
anything else getting in the way of...Programming, Motherfucker.

We are tired of being told we're autistic idiots who need to be manipulated to work in 
a Forced Pair Programming chain gang without any time to be creative because none of 
the 10 managers on the project can do...Programming, Motherfucker.

We must destroy these methodologies that get in the way of...Programming, Motherfucker.

■_

2011年03月22日

■_

■_ アドバイス求む

java - coding best practice advise needed - Stack Overflow
I want to understand if the following coding practice is good or bad practice.
以下に示す coding practice が果たして good practice なのか bad practice なのか知りたいのです。

I pass a Value Object to a method as a parameter and the called method is returning 
the same parameter Value Object. I particulary think since it the same object be 
referenced we dont need to put it as return type.

わたしは Value オブジェクトをパラメーターとしてメソッドにわたし、呼び出された
メソッドはパラメーターと同じ Value オブジェクトを返しています。
戻り値の型とする必要のない同一のオブジェクトであるので、これでいいのだろうかと
考えています。

Class A
{

  initStudent()
  {
     Student studentObj = new Student();
   //do some processing

    studentObj = processStudent(studentObj);
  }

 processStudent(Student pObj)
   {
       //do something

      return pObj;
   }

}

java coding-style coding-standards programming-practices coding-practices

it is sometimes needed to get a parameter and return a parameter of the same type, 
even if the manipulation is on the parameter for some reasons:

1. if the object is immutable (i.e. String)
2. if the object actualy can be changed (a good example for it is <T>[] List.toArray(<T>[])
   where if there is not enough space in the parameter array, a new one is created, otherwise,
   the array is written on the parameter)
3. it also implicitly tells the user the value might be changed...
4. don't be afraid to do it if needed, many projects use it (i.e. Apache Lucene)


Indeed you don't need to return the object. Just return nothing.

Class A
{

  void initStudent()
  {
     Student studentObj = new Student();
   //do some processing
    processStudent(studentObj);
  }

 private void processStudent(Student pObj)
   {
       //do something
   }

}

Edit: Also, as adarshr pointed out, the Student object that you created will be of no 
use if it does not "go out" of the initStudent() method. Probably you may 
want to return it, or store it in a Collection instance field.


Yes you shouldn't be returning it. Use access modifiers and return types properly.

public class A
{

  public Student initStudent()
  {
     Student studentObj = new Student();
     //do some processing

     processStudent(studentObj);

     return studentObj;
  }

  private processStudent(Student pObj)
  {
       //do something
  }
}

とまあいろいろ。

■_

ちと時間のたった話題ですが。

PHP Hangs On Numeric Value 2.2250738585072011e-308 - Exploring Binary

PHP Hangs On Numeric Value 2.2250738585072011e-308
By Rick Regan (Published January 3rd, 2011)

http://www.exploringbinary.com/php-hangs-on-numeric-value-2-2250738585072011e-308/

I stumbled upon a very strange bug in PHP; this statement sends it into an infinite loop:
わたしはPHPのある非常に奇妙なバグに遭遇しました。
次の文が無限ループに落ち込むのです:

<?php $d = 2.2250738585072011e-308; ?>

(The same thing happens if you write the number without scientific notation ― 324 decimal places.)

I hit this bug in the two places I tested for it: on Windows (PHP 5.3.1 under XAMPP 1.7.3),
and on Linux (PHP Version 5.3.2-1ubuntu4.5) ― both on an Intel Core Duo processor. I've
written a bug report.

わたしは自分がテストした環境の二つWindows と Linux でこのバグに当たりました。
両方とも Intel の Core Duo プロセッサー上です。わたしはバグレポートを書きました。

What's Going On?
何が起きているのか?

As a string, 2.2250738585072011e-308 causes no problems; it's when it's treated as a numeric
value that the bug hits.

文字列としては 2.2250738585072011e-308 は問題を起こしたりはしません。
数値としてその値を扱おうとしたときにバグが発現するのです。

This works:

<?php $d = '2.2250738585072011e-308'; echo $d; ?>

but this doesn't:

<?php $d = '2.2250738585072011e-308'; echo $d + 0; ?>

I don't have a debug build of PHP available, but I did poke around in the source code
(version 5.2.8). I found zend_strtod.c, which appears to be based on a very old version of
David Gay's strtod() function. I don't know if zend_strtod() is even called in this case,
but it was simple enough for me to isolate it and run it outside of PHP ― under Visual C++
in fact. It did not hang on that input. (Update: zend_strtod() is called.) What's Special
About 2.2250738585072011e-308?

2.2250738585072011e-308 represents the largest subnormal double-precision floating-point number;
written as a hexadecimal floating-point constant, it's 0x0.fffffffffffffp-1022.
2.2250738585072011e-308 is one of five 17-digit decimal values that convert (correctly) to 0x0.fffffffffffffp-1022:

2.2250738585072011e-308 は非正規化数の倍精度浮動小数点数で最大の値を表現しています。
十六進の浮動小数点定数で表記すると 0x0.fffffffffffffp-1022 となります。
2.2250738585072011e-308 は 0x0.fffffffffffffp-1022 へと(正しく)変換される
17桁の十進数値五個のうちの一つです。

* 2.2250738585072007e-308
* 2.2250738585072008e-308
* 2.2250738585072009e-308
* 2.2250738585072010e-308
* 2.2250738585072011e-308

Only 2.2250738585072011e-308 causes the problem. It happens to be the largest of the five
decimal values, so I guess that matters somehow.

If you have any thoughts on what the bug is, please let me (or PHP) know.

Is This Serious?

I've gotten no response to my bug report yet, but I can't help but wonder: is this serious?
Could you bring down a Web server by entering this constant in a PHP-based form?

Update

Yes, it's serious.

1/4/2011: There is a lot of discussion about this, in addition to the comments below;
for example, see

* Reddit.
* Slashdot.
* Hacker News.
* My PHP bug report.

And as pointed out in the comments, equivalent forms of the number also cause the problem:

* Variations on the placement of the decimal point: 0.22250738585072011e-307,
22.250738585072011e-309, 22250738585072011e-324, etc.
* Leading zeros: e.g., 02.2250738585072011e-308.
* Trailing zeros: e.g., 2.22507385850720110e-308.
* Leading zeros in the exponent: e.g., 2.2250738585072011e-0308.
* (Combinations of the above)

You can also alter the full 324 decimal place version of the number with leading and
trailing zeros.

1/22/2011:
I thought of another set of numbers that cause the same problem ― numbers of 18 digits
or more but with the same 17 digit prefix. For example:

* 2.22507385850720111e-308
* 2.225073858507201123e-308
* 2.22507385850720113099e-308
* 2.225073858507201100001e-308
* 2.2250738585072011123456789012345678901234567890123456789e-308
* etc.

In decimal these numbers are distinct, but they all map to the same floating-point number.

It seems the only ones that cause a problem though are those whose 18th digit is 0, 1,
2, or 3 (2.22507385850720114e-308 doesn't cause the problem, for example).

Update: The Bug Has Been Fixed

The bug fix was released on 1/6/2011; this is from the PHP news release:

This release resolves a critical issue … where conversions from string to double might
cause the PHP interpreter to hang on systems using x87 FPU registers.

The problem is known to only affect x86 32-bit PHP processes, regardless of whether the
system hosting PHP is 32-bit or 64-bit.

Update: An Analysis of the Bug and Its Fix

I've found the specific cause of the bug, and why the fix fixes it; please see my article
“Why “Volatile” Fixes the 2.2250738585072011e-308 Bug.”

Why “Volatile” Fixes the 2.2250738585072011e-308 Bug - Exploring Binary

2011年03月21日

■_

連休中、積読の解消どころか以前読んでた「ACMチューリング賞講演録」を読み返してたり。や、バッカス先生のとかヴィルト先生のが読みたかったのですね。にしても、この本今入手不可なんですよねえ (Amazon.co.jp： ACMチューリング賞講演集: 赤摂也: 本)。復刊リクエストも出てるみたいだけど、絶版状態でほっとくくらいなら web に上げてほしいですよね。原文がそうなってるんだし。あと、この本が出たあとも bit 誌には毎年の講演の翻訳が掲載されたわけで (bit|ACMチューリング賞講演リスト - memo.ptie.org) こっちもどうにかして欲しかったり。休刊してからの分は…どうしよう。

面白い内容なんで、訳して紹介したいところなんだけど © 2011 Pearson Education, Inc. All rights reserved. なんでちょっと躊躇したり。 Influential Programming Languages, Part 1: ALGOL Influential Programming Languages, Part 2: Simula Influential Programming Languages, Part 3: Smalltalk Influential Programming Languages, Part 4: Lisp

■_

Twitter / @Faith and Brave: Masterminds of Programming ...

Masterminds of Programmingを読むと、Objective-C設計者の老害っぷりがわかる。

がちょっと気になったり。あの本の Objective-C の章でインタビュー受けているのは二人いるのだけど、ふつー(ってなんですか) Objective-C の作者(設計者)として挙げられるのはおそらく Brad Cox の方。 Brad Cox, Ph.D.

んで、ざっと調べても

The A to Z of programming languages: Objective-C - a-z of programming languages, A to Z of programing languages - Computerworld

Do you have any regrets about maintaining the language i.e., any problems you didn't 
resolve in the language you wish you had?

No. Lack of garbage collection was a known issue from the beginning, but an 
inescapable one without sacrificing the reasons people choose C in the first place.

Was there any apprehension to selling the rights to Objective-C to Steve Jobs' NeXT in 1995?

Not really.

Do you think Objective-C would still be around today if it weren't for NeXT's acquisition?

Probably not.

How do you feel Objective-C 2.0 added to your original language? Can you still call it 
yours in any sense?

I've never thought of as “mine” in any sense, certainly not emotionally. Its a 
soldering gun. And lots of other people were involved in building it, not just me.

Have you played any significant part Objective-C's evolution since the acquisition?

Not really. I've moved on to larger granularity components, particularly SOA and OSGI.

Objective-C - Wikipedia

Objective-Cは、1983年にBrad Coxによって開発され、そのコンパイラやライブラリを支援する
ためにStepstone社を創立した。Stepstone社は、Objective-Cに力を注いだが、それはマイナー
な存在であった。Objective-Cが認知され始めるきっかけは、1985年、アップルコンピュータを
去ったスティーブ・ジョブズが、m68k機であるNeXTコンピュータとNeXTSTEPオペレーティングシ
ステムの開発を行うNeXT Computer社を創立したことに始まる。

そのマシンのユーザインタフェースは、Display PostScriptと Objective-Cで書かれた
Application Kitにより提供され、Objective-CはNeXTコンピュータの主力言語となった。その後
の歴史は、主にNeXT社とともにあり、GCCをベースにしたObjective-Cサポートが行われ、プロト
コルの導入など文法の拡張なども行われている。NeXT社による多くの成果は、GCCに還元されて
いる。

1995年には、NeXT社がStepstone社からObjective-C言語と、その商標に関する全ての権利を買い
取っている。1997年初頭、Apple社がNeXT社を買収し、NeXT社CEOのジョブズがApple社のCEOに復
帰したのに伴い、その技術をベースにしたオペレーティングシステムMac OS X上における、主力
開発言語の一つとなり、Cocoaフレームワークのコア言語として採用されている。Mac OS X 
v10.5からは一部言語仕様の変更が行われObjective-C 2.0と呼ばれる（詳細は#Objective-C 2.0
を参照）。

長らくNeXT及び、その後継であるMac OS Xの専用言語に近い状態だったが、2007年に入り、アッ
プル社の携帯電話iPhoneにおいてCocoa Touchフレームワークの記述言語として採用。iPhone OS
の普及にともない、習得者の人口が増える傾向にある。

Objective-C - Wikipedia, the free encyclopedia
History

Objective-C was created primarily by Brad Cox and Tom Love in the early 1980s at their 
company Stepstone. (略)

Love and Cox eventually formed a new venture, Productivity Products International (PPI), 
to commercialize their product, which coupled an Objective-C compiler with class 
libraries. In 1986, Cox published the main description of Objective-C in its original 
form in the book Object-Oriented Programming, An Evolutionary Approach. Although he 
was careful to point out that there is more to the problem of reusability than just 
the language, Objective-C often found itself compared feature for feature with other 
languages.

Popularization through NeXT

After Steve Jobs left Apple Inc., he started the company NeXT. In 1988, NeXT licensed 
Objective-C from StepStone (the owner of the Objective-C trademark) and released its 
own Objective-C compiler and libraries on which the NeXTstep user interface and 
interface builder were based. While the NeXT workstations failed to make a great 
impact in the marketplace, the tools were widely lauded in the industry. This led NeXT 
to drop hardware production and focus on software tools, selling NeXTstep (and 
OpenStep) as a platform for custom programming.

The GNU project started work on its free clone of NeXTStep, named GNUstep, based on 
the OpenStep standard. Dennis Glatting wrote the first GNU Objective-C runtime in 1992. 
The GNU Objective-C runtime, which has been in use since 1993, is the one developed by 
Kresten Krab Thorup when he was a university student in Denmark. Thorup also worked at 
NeXT from 1993 to 1996.

After acquiring NeXT in 1996, Apple Computer used OpenStep in its new operating system, 
Mac OS X. This included Objective-C and NeXT's Objective-C based developer tool, 
Project Builder (later replaced by Xcode), as well as its interface design tool, 
Interface Builder. Most of Apple's present-day Cocoa API is based on OpenStep 
interface objects, and is the most significant Objective-C environment being used for 
active development.

Apple 主導になってからどのように変わったのかはわからないけど、現状の Objective-C についての不満の原因をすべて Cox (と Love) におっ被せるのはどうしたもんかなあと。例の本の、Love のインタビューについては言語以外のところでこの辺かなあと思うところがなくはなかったけども、なんでもかんでも「老害」(って便利な言葉だよね)と片付けるのは良くないんじゃないかなあ。

こういうのもあった。

Rubyist Magazine - Ruby Conference 2004 レポート

Brad Cox による今年の RubyConf のキーノートスピーチです。 前半と後半に分けて、前半が 
Objective-C とそこに至るまでの話、 後半がそれ以降、現在までの彼の関心事を話す、と前置
きしました。

まず Objective-C は彼にとってはゴールではなかった、という点を 強調していました。彼の目
標はコラボレーションツールを作ることであり、 その為に必要だったので言語を設計した、と
いうことです。 (cooperation tool とも言ってました。グループウェアみたいなものでしょう
か?) そして、Objective-C の言語仕様や特徴をごく簡単に説明し、なんと 開始から 20 分ほど
で前半部分が終了してしまいました。 前ふりの時間も含めれば言語の話は 15 分もなかったの
ではないでしょうか。

(略)

RubyConf という世界でも有数の言語オタクが集合する機会であるにも 関わらず、反応を見てい
る限りは Objective-C の話がほとんど無くても 皆さんたのしんでおられたようです。Rubyist 
の性格がわかったような。 私には、Brad Cox にとっては Objective-C は完全に過去のものに 
なってしまっているんだなあ、という印象が強く残りました。

Java のライブラリについても弁護したいけどそれはまた (気力がわかない)。

まあ、「あの本読んでいてわからんようならおまえ自身も老害の一人だ」と言われればそのとおりなんですがね ○|￣|＿

■_ Don't Distract New Programmers with OOP

オブジェクト指向プログラミングで新米プログラマーの取り乱させるな。くらい?

prog21: Don't Distract New Programmers with OOP
Don't Distract New Programmers with OOP

The shift from procedural to OO brings with it a shift from thinking about problems 
and solutions to thinking about architecture. That's easy to see just by comparing a 
procedural Python program with an object-oriented one. The latter is almost always 
longer, full of extra interface and indentation and annotations. The temptation is to 
start moving trivial bits of code into classes and adding all these little methods and 
anticipating methods that aren't needed yet but might be someday.

When you're trying to help someone learn how to go from a problem statement to working 
code, the last thing you want is to get them sidetracked by faux-engineering busywork. 
Some people are going to run with those scraps of OO knowledge and build crazy class 
hierarchies and end up not as focused on on what they should be learning. Other people 
are going to lose interest because there's a layer of extra nonsense that makes 
programming even more cumbersome.

At some point, yes, you'll need to discuss how to create objects in Python, but resist 
for as long as you can.

(If you liked this, you might like How I Learned to Stop Worrying and Love Erlang's Process Dictionary.)
permalink March 16, 2011

■_ もしもわたしがJでプログラミングしたら

Obviously he is a stoner 何だこの言い回し。

Mind if I do a J? : programming

Obviously, you're not a golfer.


excuse me? indeed, i'm not a golfer.


Code golf.


ah, thx; indeed, i'm no golfer, though i like economy of input in my shell (and that's 
partially why i looked at J) there is always M-/ in Emacs both for sanity's sake of 
reading the code and not needing to be touch typer type.



"Mind if I do a J?" and "Obviously, you're not a golfer." are 
quotes from the movie The Big Lebowski.


Obviously he is a stoner


You say that like it's a bad thing.


No objection here, just make sure you pass it to the left.


Oh my god. All the operators in J pass their result on the left hand side. That's brilliant.

Are you sure this is the biggest problem? What about this one? 6%3 2 or even... _5+5 0 "Everyone uses the slash for division, percent for the modulo, and minus sign for negation. That's too mainstream for us." Fucking hipster language.



My memory of J is a bit rusty, but I believe there are reasons for those points.

"/" gets appropriated to the fold adverb. This use comes directly from APL. 
So it's not as if it's totally without reason, though that doesn't necessarily mean 
it's a good reason.

"%" gets promoted to division duty, and "|" takes over his old job. 
Which, while alien to programmers, it makes a teensy bit more sense for those familiar 
with number theory and whatnot, where Mr. Pipe is used to indicate divisibility. So, 
it's fairly arbitrary given that divisibility is separate from modulo, but they're 
sorta close so I guess it's a bit less arbitrary than using Mr. Percent.

Lastly, the underscore is part of the number syntax, rather than a verb like the 
hyphen. For example:

   _1 2 3
_1 2 3
   - 1 2 3
_1 _2 _3

Given that J is an array language, this behavior makes sense to me. As you pointed out, 
the minus sign is generally used for negation. This seems like a sensible way to 
extend that notion to arrays.

That being said, it's fine to disagree. J does stray quite a bit from the norm. And 
it's a little fucking crazy at times.

I highly recommend Learning J to anyone interested in the language.

Also, if you want some interesting visual stuff, type "require 'viewmat'" 
into the prompt, then apply that verb to matrixes. Damn.

    6%2 NB. That's division by the way! WHY?

Well, Ken's previous language, APL, wrote that as 6÷2 instead. But people complained 
that they didn't have a ÷ sign on their keyboard, or in their font, or in fact in 
their character encoding at all. So when he designed J, he used ASCII, and % is the 
closest thing to ÷.

NB. comes from Latin by way of JOSS.


Because it uses / for reduce...

e.g.:

    +/ 1 2 3 4 5
15

Now, / is the adverb, probably to keep the dictionary more consistent or something.

"APL is a mistake, carried through to perfection. It is the language of the 
future for the programming techniques of the past: it creates a new generation of 
coding bums." -Djikstra.

That said, if this is the dumbest language you've seen, you need to get out more


What's the smartest language you've ever seen? Java?


COBOL.


Java is pretty good. They made a few (self-admitted) design mistakes, like their 
implementation of generics and auto-boxing, but it is generally a very good language.

Python is quite good as well, but in very different ways, and you need a paradigm 
switch if you want to program effectively in Python.

C++ is quite good, as long as you impose strict guidelines on which features are not 
to be used (like exceptions) and which ones are to be used in very limited 
circumstances.


My post was tongue-in-cheek. Java is not a bad language, but my point was, just 
because you don't understand what a language is useful for doesn't mean it's 
"dumb". I'm pretty certain J is very useful to some people.

Also, learning stuff that you don't understand in the first place is a good way to 
learn different ways of doing things and to keep your creativity sharp.


You don't know mathematical speed and power until you've worked with J or APL.

■_

ぼつ。

■_

一つ前へ 2011年3月(中旬)
一つ後へ 2011年4月(上旬)

ホームへ

リンクはご自由にどうぞ

メールの宛先はこちら kbk AT kt DOT rim DOT or DOT jp

ときどきの雑記帖 混迷編

ときどきの雑記帖混迷編