# ときどきの雑記帖 混迷編

### 2011年03月31日

>

#### ■_ The End of Monkeypatching

モンキーパッチ止めようぜ。かな?

The End of Monkeypatching
The End of Monkeypatching

By Xavier Shay / March 29, 2011

Monkey-patching is so 2010. We're in the future, and with Github and Bundler there is now
rarely a need to monkey-patch Ruby code in your applications.

Monkey-patching is the dangerous-yet-frequently-useful technique of re-opening existing
classes to change or add to their behavior. For example, if I have always felt that Array
should implement the sum method, I can add it in my codebase:

モンキーパッチは既存のクラスの変更や振る舞いの追加を行うために
クラスを再オープンする dangerous-yet-frequently-useful technique です。
たとえば仮にわたしがいつもArrayにはsumメソッドが実装されているべきだと感じていたならば
それを自分の codebase で追加できます:

class Array
def sum
inject {|sum, x| sum + x }
end
end

That is a monkey-patch. Of course, when I require activesupport it also adds a sum method to
Array though its version has an arity of one and takes a block. This conflict can cause hard
to track down errors and is why monkey-patching is to be used with caution.

これがモンキーパッチです。もちろん、わたしが activesupprt を require したときにも
(activesupprt 版の)引数一つとブロックをとる sum メソッドが Array に追加されます。この

Thankfully, the spread of this abuse is minimal and most developers understand the risks.
More frequently, monkey-patching is used to quickly fix bugs in existing libraries by
reopening a class and replacing an existing method with an implementation that works
correctly. This is often a fragile solution, relying on on sometimes complex techniques to
override exactly the right bit of code, and also on the underlying library not be refactored.

ありがたいことに、spread of this abuse は最小限であり、大半の developer はモンキーパッ
チングを使うことのリスクを理解しています。より頻繁には、モンキーパッチングは既存のライ
ブラリにあるバグをクラスの再オープンして既存のメソッドを正しく動くコードに差し替えるこ
とでさっさとバグを修正するのに使われています。これは大抵の場合 fragile solution (不安

ており、またライブラリがリファクタリングされないことにも依存しています。

In the dark ages when it was troublesome to own or release your own versions of gems, this was
far easier and more robust solution: fork the offending code, fix it at the source, and set up
your application dependencies to use your new code directly. All without having to package a
new gem!

モンキーパッチを駆使することが行使できる唯一の cheap solution でした。
しかし現在、modern なRubyアプリケーションは格段に簡単に、かつ頑強な solution を
とれるようになっています。問題のあるコードを fork して、ソースコードを修正してから
それを自分のアプリケーションに直接新しいコードをセットアップするのです。
すべてはパッケージやgemを持つ必要なく行えます。

The first few steps of this process are mostly self-explanatory, but I have documented them
below anyway. If you are already old hat at this stuff, feel free to skip directly to step 4.

このプロセスの最初の数ステップは mostly self-explanatory ですが、

もしすでにこのことについて知っているのであれば、ステップ4まで飛ばしてしまっても

Step 1: Fork the Library
ライブラリを forkする

It is rare to find a Ruby gem or library that isn't on Github these days. After locating the
library, always check the network graph to try and find other popular forks. Often the problem
you are trying to solve has already been fixed by another developer. In that case, you can skip
straight to step 3. Otherwise, fork the code base to your own GitHub account.

Clone your fork and make whatever changes you need. If you are feeling generous, add
an appropriate test to the code base as well so it can be contributed back to the
original fork, but as long as you have a test somewhere (such as in your main app) for
the desired behavior you will be fine.

Gemfile を変更する

Point your Gemfile at the new code:

# Gemfile
# From this
gem 'rails'

# To this
gem 'rails', :git => 'git://github.com/xaviershay/rails', :branch => 'important-fix'

And reinstall your gems by running bundle.

bundle を実行して gems をインストールします。

Step 4: Document

This step is important. There is no excuse for skipping it. You need documentation in
three places:

1. A note at the top of the README in your fork, documenting the changes. Any developer
an stumble across a public fork, and there is nothing more frustrating than trying to
figure out whether a fork already solves your problem. At the very least, a “here be
dragons” note will be appreciated.

2. The place in your code base that depends on the fork. You can expect other developers
to be familiar with rails and the standard gems. They won’t be familiar with the

3. The Gemfile. Make a note above your gem line as to why a fork is required.
Provide enough information that a future developer will know when or if it would be
appropriate to upgrade or switch back to the main gem. Here are some real examples
from some of my projects:

# An experimental fix for memory bloat issues in development, if it works
# I will be patching to core.
gem ...

# 1.1 requires rubygems > 1.4, so won't install on heroku. This fork removes
# that dependency, since it is actually only required for the development
# dependencies.
gem ...

# Need e86f5f23f5ed15d2e9f2 in master and us to upgrade to dm-core 1.1
# before we switch back. Should be in 1.1.1 release.
gem ...



#### ■_

Non-trivial monads in a functional programming language without closures (functions are first class but can't be partially applied)? : haskell

This is a generic FP lang design question and not really Haskell, but r/functional's
latest post is 1 year ago so I put it here instead.

I'm designing a language with a statically-allocatable restriction, based on some
research about using FP langs for behavioural hardware synthesis. To summarize, the
limitations imposed in order to make behavioural synthesis tractable are:


#### ■_

A Quick Look at the Rust Programming Language

2011-03-31

A Quick Look at the Rust Programming Language

The Rust Programming Language is a systems programming language being developed by
Mozilla. It was announced last year and has seen quite a bit of development since then.

Rust は Mozzila によって開発されているシステムプログラミング言語です。
その開発は昨年アナウンスされており、以後開発が続いていました。

I've only been lightly following the development over the past year but recently
decided to spend a bit more time looking at it. The following is a look at the
language and implementation from someone who isn't heavily involved in the
development so don't take anything I write as gospel. Hopefully I don't get too many
things wrong.

The Rust Language FAQ lists the following summary of features:
Rust のFAQ では以下のような機能のサマリをあげています:

* Memory safe. No null pointers, wild pointers, etc. Automatic storage management.
メモリー安全。ナルポインターや wild ポインターといったものはありません。
自動的にメモリー管理されます。

* Mutability control. Immutable by default. No shared mutable state across tasks.
Mutabliity の制御。デフォルトでは immutable であり、タスクをまたがって共有される
mutable state はありません、

* Dynamic execution safety: task failure / unwinding, trapping, logging. RAII / dtors.
安全な動的実行

* Typestate system: ability to define complex invariants that hold over data structures.

* Explicit memory control. Layout and allocation control. Interior / value types.

* Very lightweight tasks (coroutines). Cheap to spawn thousands-to-millions.
非常に軽量なタスク(コルーチン)。

* Stack iterators (effectively lambda-blocks w/o heap allocation).
スタックイテレーター (ヒープの割り当てを行わない効率の良いラムダブロック)

* Static, native compilation. Emits ELF / PE / Mach-o files.

* Direct and simple interface to C code (switch stacks and call, ~8 insns).
Cのコードに対する直接的で単純なインターフェース
(スタックの切り替えと呼び出しで最大8命令)

* Multi-paradigm. pure-functional, concurrent-actor, imperative-procedural, OO.

* First class functions with bindings.

* Structurally-typed objects (no nominal types or type hierarchy).

* Multi-platform. Developed on Windows, Linux, OSX.
マルチプラットフォーム

* UTF8 strings, assortment of machine-level types.
UTF8 文字列

* Works with existing native toolchains. GDB / Valgrind / Shark / etc.
既存の native ツールチェインと一緒に動作する

* Practical rule-breaking: can break safety rules, if explicit about where and how.

Conclusion

Rust is an interesting language and there's quite a bit more to it than I've covered
here. I just picked random features to try. I'm looking forward to rustc being more
complete and trying out some of the language features in more ‘real world' examples
to get a feel for it.

Rust はわたしがここで紹介したよりもずっと面白い言語です。
わたしは単にランダムにいくつかの機能を取り上げようとしただけです。



#### ■_ オススメ本

推薦図書/必読書のためのスレッド 61

387 デフォルトの名無しさん [sage] 2011/03/31(木) 21:05:15.81 ID: Be:
>>385
どういたまして＾＾

未だに良書と出会って無いんだが、アルゴリズムはコード丸暗記じゃ無く、理論的に覚えた方が良い

理論的にも理解し易く(素人でもアルゴリズムの原理を理解しやすい)、実際に動かせるコードになるからね
(実は自分が手元に置いておきたい)

理論的に理解できれば、新しい言語への移植も容易になる

命令の最適化よりも、アルゴリズムによる最適化の方がパフォーマンスに大きく貢献する。と言うのは覚えておいて欲しい

388 デフォルトの名無しさん [sage] 2011/03/31(木) 21:40:01.63 ID: Be:
>>387

辞典ではないが、アルゴリズムの解説本は出てる

Pearls of Functional Algorithm Design

これなんか、関数型でのアルゴリズムの考え方を

各トピックで、まず最初に仕様を関数で表現したり、
拙いが直感的なアルゴリズムを解説してから、
徐々に効率を上げていく解説方法なんで、ワクワクしながら読める
またその効率が、なぜ、どのくらい上がるのかの説明も理論的だ

389 デフォルトの名無しさん [sage] 2011/03/31(木) 21:47:25.30 ID: Be:
>>388
洋書の方か!!
最近探してなかった
thanks

日本語に翻訳されたら、珠玉の一冊として紹介されるだろうに。。。

390 デフォルトの名無しさん [sage] 2011/03/31(木) 21:51:33.83 ID: Be:
>>388
kindleじゃ出てないんだね。。。
いや、買うけど


#### ■_ 制限

Twitter / @木村屋: Cの標準関数のヘッダファイルの名前が変に短いのは、処 ...

Cの標準関数のヘッダファイルの名前が変に短いのは、処理系の制限が原因だったのだろうか。


Twitter / @M教授: @kimuraya osのファイルシステムの最大名前 ...

@kimuraya osのファイルシステムの最大名前長の制限


これどうなんだろう。 FORTH の「処理系の名前」の長さがそれに引きずられたとか聞いた覚えはあるけど、 関数名だしなあ。 ライブラリを一関数一ファイルで作っていたとしても、 UNIXって6文字とかに制限はなかったような (System V で14文字とかいうのを見た覚えはある)。 誰か情報を持ってたら教えてください。

### 2011年03月30日

#### ■_

ヴァンデの反乱とか知らなかった ヴァンデの反乱 - Wikipedia

#### ■_

FAQ: What programming language should I learn first?

FAQ: What programming language should I learn first?

Posted on Mar 28th, 2011 in FAQs, Programming Languages, Python

There are hundreds of different programming languages out there. As a newcomer you can
ignore the fact that most of them exist. However, even if we narrow the list to just a
dozen mainstream languages, deciding on what programming language to learn first can
be a daunting task. You might find yourself asking, should I learn C, C++, Java, C#,
or PHP first? If you ask ten programmers this question, you'll probably hear ten
different answers. Here is my take.

Much as with human languages, programming languages are used to communicate.
Interestingly they still involve communication between people, whether other
programmers will end up reading/modifying/enhancing your code or you'll do at a later
point in time. Unlike natural languages however, programming languages are
unequivocally understood by computers, thanks to the aid of interpreters, compilers,
and similar types of software.

As such, my suggestion would be to start with Python, and use it as a tool to learn
the general craft of programming. Learning Python is fun, easy, and useful. You'll be
able to use it for a wide array of projects in several environments (scripting, web,
scientific research, etc…).

わたしの提案は、Python から始めること。そしてプログラミングの general craft を学ぶための
ツールとして Python を使つというものです。Python を学ぶことは楽しく、簡単で、有用です。
Python を学ぶことであなたは、いくつかの環境 (スクリプティング、web、科学研究などなど)
における wide array of project のために Python を使えるようになるでしょう。

There are a variety of free tutorials on the web, but if you want a more
rigorous/systematic/academic introduction, I highly recommend “Python Programming: An
Introduction to Computer Science (2nd Edition)” (USA | UK | Canada).

web 上には、様々な無料で手に入るチュートリアルがあります。しかしあなたがより
わたしは “Python Programming: An Introduction to Computer Science (2nd Edition)”
を強くお薦めします。

Once you have learned the fundamentals of programming, have a decent command of the
Python language, and have gained some experience with practical Python projects, you
should be better armed to evaluate and pick up other languages and frameworks based on
the projects you intend to develop or contribute to in the future (Open Source
projects are awesome for this purpose).

プログラミングの基礎 (fundamentals) を学び、



#### ■_ これはない

OKWaveから丸投げネタ。

awkでのsh処理について | OKWave

awkでのsh処理について

HP-UX環境、UNIXです。

1行目の11カラム目にOUTが含まれているかつ2行目の11カラム目にINが含まれている行だけ
ファイルに出力するという処理を以下のように考えたんですが、うまくいきません。

awk'{m == NR % 2}
m==1{if($11~ "OUT")} && m==0{if($11~ "IN") print $0} ' [ファイル名] 文法的に誤っていますでしょうか？？ 回答宜しくお願い致します。  質問者が選んだベストアンサー べたなロジックですが cat ファイル名 | awk '\ BEGIN { FILE=-1 } { if (substr($0,11,3)=="OUT") {
LINE=NR
}
if (NR==(LINE+1) && substr($0,11,2)=="IN") { print$0
}
}
END {}' > 出力ファイル名

ではいかがでしょうか。(インデントのために全角空白が入っています)

「sh処理」は何を言っているのか分かりません。


ANo.1

そのエラーメッセージは読んでますか?

とりあえず

> awk'{m == NR % 2}

kと'の間がつまっているのは元からですか?貼り付けたときに変わていまったのですか?

> m==1{if(11~ "OUT")} && この&&はなんでしょう? なお、2行1セットでやりたいのなら、このやりかたでは難しいでしょう。  色々とすいません。kmeeさん。 kと'の間がつまっているのは ⇒貼り付け時です。 メッセージは ⇒syntaxエラー系のメッセージですね。内容は理解出来ませんでした。 2行1セットのやり方はご存知ですか？？ 投稿日時 - 2011-03-29 20:03:10  なんで getline を使わんとですか。 ### 2011年03月29日 #### ■_ イライラが止まらない～ #### ■_ Python の dictionary や、まあ。ハッシュテーブルなんですが。 日曜日に参加したソースコードリーディングで dictobject のところを読んで 非常に面白いものだということを知りました。 ああ、今日もまとめを書いている時間ががががががおかいがー♪ まず、テーブルの大きさが素数ではありませんでした。 以前からテーブルの大きさが素数ではなくてもハッシュ関数がきちんとしたものであれば 問題ないという話は何度か目にしていましたが、実際の処理系で2のべきの大きさを 使っているのをみたのは多分初めてな気が。 そしてもうひとつの興味深い点として、ハッシュ値の衝突の対処として チェーン法を使っておらず、オープンハッシュアドレッシング(もしくはクローズドハッシュ)を使っているということです。 ドキュメントがあったり、コメントにいろいろ書かれているので読んでみると良いかも。 /* Major subtleties ahead: Most hash schemes depend on having a "good" hash function, in the sense of simulating randomness. Python doesn't: its most important hash functions (for strings and ints) are very regular in common cases: >>> map(hash, (0, 1, 2, 3)) [0, 1, 2, 3] >>> map(hash, ("namea", "nameb", "namec", "named")) [-1658398457, -1658398460, -1658398459, -1658398462] >>> This isn't necessarily bad! To the contrary, in a table of size 2**i, taking the low-order i bits as the initial table index is extremely fast, and there are no collisions at all for dicts indexed by a contiguous range of ints. The same is approximately true when keys are "consecutive" strings. So this gives better-than-random behavior in common cases, and that's very desirable. OTOH, when collisions occur, the tendency to fill contiguous slices of the hash table makes a good collision resolution strategy crucial. Taking only the last i bits of the hash code is also vulnerable: for example, consider the list [i << 16 for i in range(20000)] as a set of keys. Since ints are their own hash codes, and this fits in a dict of size 2**15, the last 15 bits of every hash code are all 0: they *all* map to the same table index. But catering to unusual cases should not slow the usual ones, so we just take the last i bits anyway. It's up to collision resolution to do the rest. If we *usually* find the key we're looking for on the first try (and, it turns out, we usually do -- the table load factor is kept under 2/3, so the odds are solidly in our favor), then it makes best sense to keep the initial index computation dirt cheap. The first half of collision resolution is to visit table indices via this recurrence: j = ((5*j) + 1) mod 2**i For any initial j in range(2**i), repeating that 2**i times generates each int in range(2**i) exactly once (see any text on random-number generation for proof). By itself, this doesn't help much: like linear probing (setting j += 1, or j -= 1, on each loop trip), it scans the table entries in a fixed order. This would be bad, except that's not the only thing we do, and it's actually *good* in the common cases where hash keys are consecutive. In an example that's really too small to make this entirely clear, for a table of size 2**3 the order of indices is: 0 -> 1 -> 6 -> 7 -> 4 -> 5 -> 2 -> 3 -> 0 [and here it's repeating] If two things come in at index 5, the first place we look after is index 2, not 6, so if another comes in at index 6 the collision at 5 didn't hurt it. Linear probing is deadly in this case because there the fixed probe order is the *same* as the order consecutive keys are likely to arrive. But it's extremely unlikely hash codes will follow a 5*j+1 recurrence by accident, and certain that consecutive hash codes do not.  テーブルの大きさを2のべきにして、衝突したときの次候補の位置を j = ((5*j) + 1) mod 2**i という式を使って求めているのですが、この式を使うと同じ位置を二度以上通ることなく すべての要素を回って元の位置に帰ってくるらしいのですね(数学的な証明とかわかりません)。 def travel(i) j=0 ary=[] begin j = ((5*j) + 1) % 2 ** i ary.push(j) end until j==0 ary end 2.upto(20) do |i| pat = travel(i) puts "length=#{pat.length}" puts pat.join('->') if i<9 p pat.sort == [*0..2**i-1] end  とこんなスクリプトで試してみました。 length=4 1->2->3->0 true length=8 1->6->7->4->5->2->3->0 true length=16 1->6->15->12->13->2->11->8->9->14->7->4->5->10->3->0 true length=32 1->6->31->28->13->2->11->24->25->30->23->20->5->26->3-> 16->17->22->15->12->29->18->27->8->9->14->7->4->21->10->19->0 true length=64 1->6->31->28->13->2->11->56->25->62->55->20->37->58->35-> 48->49->54->15->12->61->50->59->40->9->46->39->4->21->42-> 19->32->33->38->63->60->45->34->43->24->57->30->23->52-> 5->26->3->16->17->22->47->44->29->18->27->8->41->14->7-> 36->53->10->51->0 true length=128 1->6->31->28->13->66->75->120->89->62->55->20->101 ->122->99->112->49->118->79->12->61->50->123->104-> 9->46->103->4->21->106->19->96->97->102->127->124-> 109->34->43->88->57->30->23->116->69->90->67->80-> 17->86->47->108->29->18->91->72->105->14->71->100-> 117->74->115->64->65->70->95->92->77->2->11->56-> 25->126->119->84->37->58->35->48->113->54->15->76-> 125->114->59->40->73->110->39->68->85->42->83->32-> 33->38->63->60->45->98->107->24->121->94->87->52-> 5->26->3->16->81->22->111->44->93->82->27->8->41-> 78->7->36->53->10->51->0 true length=256 1->6->31->156->13->66->75->120->89->190->183->148-> 229->122->99->240->177->118->79->140->189->178->123-> 104->9->46->231->132->149->234->147->224->97->230-> 127->124->109->34->171->88->185->158->23->116->69-> 90->195->208->17->86->175->108->29->146->219->72->105-> 14->71->100->245->202->243->192->193->198->223->92-> 205->2->11->56->25->126->119->84->165->58->35->176-> 113->54->15->76->125->114->59->40->201->238->167-> 68->85->170->83->160->33->166->63->60->45->226->107-> 24->121->94->215->52->5->26->131->144->209->22->111-> 44->221->82->155->8->41->206->7->36->181->138->179-> 128->129->134->159->28->141->194->203->248->217->62-> 55->20->101->250->227->112->49->246->207->12->61->50-> 251->232->137->174->103->4->21->106->19->96->225->102-> 255->252->237->162->43->216->57->30->151->244->197-> 218->67->80->145->214->47->236->157->18->91->200-> 233->142->199->228->117->74->115->64->65->70->95->220-> 77->130->139->184->153->254->247->212->37->186->163-> 48->241->182->143->204->253->242->187->168->73->110-> 39->196->213->42->211->32->161->38->191->188->173->98-> 235->152->249->222->87->180->133->154->3->16->81->150-> 239->172->93->210->27->136->169->78->135->164->53->10-> 51->0 true length=512 true length=1024 true length=2048 true length=4096 true length=8192 true length=16384 true length=32768 true length=65536 true length=131072 true length=262144 true length=524288 true length=1048576 true  #### ■_ Coq とか Agda とか見えたので読んでみようと思ったものの λ Tony's blog λ ≫ Blog Archive ≫ A brief point on static typing A brief point on static typing This post also incidentally attempts to justify my answer to a question that I regularly encounter, what is your favourite programming? It's not possible to write bug-free programs. (バグのないプログラムを書くことは不可能である。) This particular myth is quite popular, and its recent inclusion in a discussion I had has prompted me to write this post. The profound fact of this matter is that this statement is “as false as you can get.” In other words, there exists a formal proof of its negation, and subsequent counter-examples. It has also been my experience that belief in this myth has degenerative practical implications. この particular myth はとてもよく知られていて、 最近わたしが行った議論にあったものはわたしにこのpostを書かせる原動力となりました。 The profound fact of this matter is that this statement is “as false as you can get.” 言い換えれば there exists a formal proof of its negation, and subsequent counter-examples. It has also been my experience that belief in this myth has degenerative practical implications. To emphasise the point, there exist programming languages for which it is impossible to write an incorrect program. What does it mean for a program to be correct? It means the program terminates. Such languages include (in order of my decreasing experience with them): Coq, Agda, Epigram, Isabelle. ここで強調すべきは、正しくないプログラム (incorrect program) を書くのが不可能な プログラミング言語が存在しているということです。 プログラムが正しい状態にあるとはどういうことでしょうか? それはそのプログラムが終了するということです。 正しくないプログラムを書くことのできないそのような言語には Coq、Agda、Epigram、Isabelle といったものがあります。 略  ### 2011年03月28日 #### ■_ 調子が悪いときはとっとと寝る。 #### ■_ Scala と Javaのアンチパターン NfNitLoop: Scala vs. a Java Anti-Pattern Scala vs. a Java Anti-Pattern Scala と Java のアンチパターン In the last few months I've been playing with Scala, a new(ish) programming language that runs on the JVM. The problem with learning a fun new language is that you start wanting to use its features, even when you're still using a different language. (In this case, Java.) But I came across a case this past week that illustrated this so well that I actually wanted to write it down to share. ここ数ヶ月の間、わたしはJVM上で動作する新しい(っぽい)プログラミング言語である Scala を試していました。楽しい新言語を学ぶときの問題は、まだ異なる言語 (この場合は Java) を使っている場合にさえその機能を使いたくなってしまうということです。 But I came across a case this past week that illustrated this so well that I actually wanted to write it down to share. The Anti-pattern アンチパターン It's common to need to return more than one item. Maybe a name and date, or an ID and a status, or a result and an error code. Languages like Java don't really make this easy -- the return statement only takes one argument. 二つ以上のアイテムを返す必要性は一般的なものです。 それはおそらく、名前と日付、ID とステータス、結果とエラーコードといったものであるでしょう。 Java のような言語は複数の値を返すことが容易ではありません。 リターン文はただひとつの引数しかとらないからです。 This often gets solved by returning some type that approximates a tuple. For example: これはほとんどの場合、タプルのような型を返すことで解決されます。 たとえば // ... TuplePair<Integer,ErrorType> pair = foo.computeSomeValue(); if (checkErrorConditionOK(pair.getSecond())) return pair.getFirst(); else throw new AppropriatelyNamedExceptionTypeException(pair.getSecond().getErrorMessage()); // ... This is pretty ugly. But foo has returned a couple values to us. And with Java generics, we even get some compile-time type checking -- pair.getFirst() is an Integer, and pair.getSecond() is an ErrorType. But that's a pretty ideal case with a nice descriptive type parameter. What if our parameters are of the same type? これはとても醜いものですが、とにかく foo は二つの値を返します。 そして Java のジェネリクスを一緒に使うことでコンパイル時の型チェックさえ手に入れられます。 pair.getFirst() は Integer であり、pair.getSecond() は ErrorType です。 しかしこの場合に理想的なのは descriptive な型パラメータです。 // ... TuplePair<Integer,Integer> pair = foo.computeSomeValue(); if (checkErrorConditionOK(pair.getSecond())) return pair.getFirst(); else throw new AppropriatelyNamedExceptionTypeException("Got error code: " + pair.getSecond()); // ... From this code, I have no way of knowing that the second item in the tuple is an error code. I can infer it from the way the code is written. But maybe whoever wrote this code got the parameters mixed up. If I'm reading through this code (debugging or doing a thorough code review), I now have to open up foo's implementation to see what actually gets returned in that tuple. このコードからはわたしがタプルの二番目の要素がエラーコードであるということを知る手段は ありません。わたしはそのコードが書かれた手順からそれがエラーコードであることを推測でき ます。 But maybe whoever wrote this code got the parameters mixed up. 仮にわたしがこのコードを(デバッグやコードレビューなどで)読んだとすると、 foo の実装を開いてタプルで返しているものが実際にはなんであるのかを 確認しなければならないでしょう。 In this case, our TuplePair is barely better than an array. この場合、TuplePair は配列よりもずっと好ましいものです。 Why the Anti-pattern Persists なぜアンチパターンが持続するのか It's easy to understand why I keep seeing this pattern in Java, though. There's a lot of boring boilerplate code that has to be written to work around this problem. To make the code more meaningful and readable, and to allow for easier refactoring in the future, you should assign a name to each thing you want to return. (No, you don't want to do that with a Map. That's an anti-pattern for another blog post.) The best way to do this is with a new class. とはいえなぜこのパターンを Java で目撃し続けているのかを理解するのは容易です。 There's a lot of boring boilerplate code that has to be written to work around this problem. この問題を回避するために書かなければならない退屈な boilerplate code があまりにもたくさ んあります。このコードをより meaningful かつ読みやすいものにするために、また、将来にお けるリファクタリングを容易なものにするために、返したいと考えているものそれぞれについて 名前を割り当てるべきなのです (No, you don't want to do that with a Map. That's an anti-pattern for another blog post.)。 これを行うベストの手段は新しいクラスを使うことです。 In Java, writing a new class is pretty heavyweight. If you want to use proper encapsulation, your new class might look like this: Java では新しいクラスを書くのはとても大ごと (heavyweight) です。 もし適切な encapsulation を使いたいのなら、新しいクラスは次のようになるでしょう: public class ComputedValue { private int value; private int errorCode; public ComputedValue(int value, int errorCode) { this.value = value; this.errorCode = errorCode; } public getValue() { return value; } public getErrorCode() { return errorCode; } } Since ComputedValue is a public class, you have the option of making it a static inner class, or going the more common route and creating an entirely new file. But at least now our code is a bit more readable: ComputedValue はパブリックなクラスなので、これを static な inner クラスにするか より一般的なルートを行って完全に新規のファイルを作成するかの選択肢があります。 しかし少なくとも現在わたしたちのコードは多少読みやすいものです: // ... ComputedValue computed = foo.computeSomeValue(); if (checkErrorConditionOK(computed.getErrorCode())) return computed.getValue(); else throw new AppropriatelyNamedExceptionTypeException("Got error code: " + computed.getErrorCode()) // ... What Scala Offers Scala が offer するもの Given the above option, it's not surprising that people keep creating the TuplePair class. Frankly, it's a pain in the ass to write new Java classes for every case like this. The thinking is “Write TuplePair once and re-use it so we don't have to do it again,” even though it leads to useless names like getFirst() and getSecond(). 上記の選択肢を与えられたときに、Tupler クラスを作り続ける人がいることは 不思議なことではありません。 率直に言って、このような自体の度ごとにあたらしいJavaのクラスを記述することは 苦痛を伴うものです。その思考は “Write TuplePair once and re-use it so we don't have to do it again,” even though it leads to useless names like getFirst() and getSecond(). Scala helps solve this class of problems by making creating these types of classes dead simple: Scala はこういったクラスの型の作成をとことんシンプルにすることで この問題のクラスの解決を手助けします。 class ComputedValue(val value: Int, val errorCode: Int) This single line gives you everything that the above Java implementation did: A type with “value” and “errorCode” properties, with a constructor for easily creating new instances. And while you didn't have to write any boilerplate code to provide encapsulation, it's there, implemented automatically by the Scala compiler. この一行は前述の Java での実装が行っているものすべてを与えています: “value”と“errorCode”というプロパティーと 新しいインスタンスを生成できるコンストラクターを持った型です。 And while you didn't have to write any boilerplate code to provide encapsulation, it's there, implemented automatically by the Scala compiler. Classes in Scala are public by default, and they don't have to be declared in separate files or as static inner classes. That makes the barrier to entry to writing better code a single line in a file you already have open. The Scala version of the above snippet becomes: Scala におけるクラスはデフォルトで public であり、別のファイルで宣言したり static な inner class とする必要はありません。そういったものは既にオープンしているファイルで 一行の better code を書く妨げになります。 先のコード片の Scala版はこうなります: // ... val computed = foo.computeSomeValue(); if (checkErrorConditionOK(computed.errorCode)) return computed.value; else throw new AppropriatelyNamedExceptionTypeException("Got error code: " + computed.errorCode) // ... But Scala gives you even one more nice feature in this simple case. So far, we've only been looking at code that consumes a ComputedValue instance. But what about the code that creates it? しかし Scala はこの単純なケースにおいて、さらにもう一つniceな機能を与えています。 So far, we've only been looking at code that consumes a ComputedValue instance. しかし生成されたコードはどうでしょうか? // Java: public ComputedValue computeValue() { return new ComputedValue(0, 42); } Even with our new ComputedValue class, this code still has the same problem as our TuplePair! Someone reading this code can't tell which is the value and which is the error code. To be sure, we would have to open up ComputedValue and look at its constructor parameters. But we're still not sure if the developer knew the proper parameter order when he wrote that line of code. We can't see his intention from the information in the call to the constructor. I use the following pattern to make this more clear in Java: 新しい ComputedValue クラスを使ってさえ、このコードにはまだ TuplePair と同じ問題があります! Someone reading this code can't tell which is the value and which is the error code. 誰かがこのコードを読んでも、どちらが value でどちらがエラーコードなのか区別できません。 To be sure, ComputedValue を open してそのコンストラクターに対するパラメーターを確認しなければなりません。 But we're still not sure if the developer knew the proper parameter order when he wrote that line of code. コンストラクターの呼び出しから得られる情報からは、コードの書き手の意図を読み取れません。 Java においてこれをより明確にするためにわたしは次のパターンを使っています: public ComputedValue computeValue() { final int value = 0; final int errorCode = 42; return new ComputedValue(value, errorCode); } The above at least makes my intentions clear. But to check whether my argument order matches the argument order expected by the constructor, you'll have to look at ComputedValue's implementation. How does Scala improve on this? 上記のものは少なくともわたしの意図 intentions を明確にしています。しかし引数の並びがコンス トラクターが予期しているものとマッチしているかどうかを確認するには、ComputedValue の実 装を確認必要があるでしょう。Scala はこれをどのように改良しているのでしょうか? def computeValue() = new ComputedValue(value = 0, errorCode = 42) Again, a single line of Scala replaces several lines of Java. The code here is not only more concise, but more explicit. The return type of computeValue() is inferred by the result of the method's implementation on the right-hand side. And by using named arguments, we not only make our intention clear to the reader, but also to the compiler. Scala will check at compile-time that ComputedValue's constructor takes “value” and “errorCode” parameters, and will even re-order our paramters for us if we got them in the wrong order. 繰り返しますが、Scala の一行は Java の複数行を置き換えます。ここで挙げたコードは一貫性 があるだけでなくより explicit でもあります。computeValue() の戻り値の型は、その右辺に あるメソッドの実装の結果に影響を受けます。また、名前つき引数を使うことでリーダーに対し て意図を明確にするだけでなくコンパイラーに対しても意図を明確にしています。Scala はコン パイル時に ComputedValue のコンストラクターが“value”と“errorCode”のパラメーターを 取ることをチェックしていて、加えて間違った順序でそれらを渡したときにパラメーターの re-order さえするでしょう。 These are some of the most basic features that Scala offers, and yet they're the ones that come up in day-to-day work over and over again. By getting tedious boilerplate out of the way, Scala lets coders write better code without worrying about making more work for themselves. Scala が提案する最も基本的な機能がいくつかあり、 Posted by Cody Casterline at 6:56 PM  #### ■_ んー、このくらいならという気もするけど染まっちゃってるせいですかね ＞strcpy Code 2 Read » Blog Archive » Writing Code (to Read) Writing Code (to Read) (読むために) コードを書く I wanted to start my blog with a post about strcpy implementation in C. This is an oversimplified example of the underlying point I want to make with my blog, but an example I wanted to show nonetheless. I first happened upon the differing possible implementations of strcpy in an article about C that I found on reddit. It had been a few months since I'd written any C code and I was preparing for job interviews for an internship. I wanted to brush up on C and I noticed the article. The article was phenomenal and really helped refresh my memory. I can't seem to find the article; if anyone happens to have a link please share it in the comments. Here's an implementation of strcpy: 以下の例は strcpyの実装です: static void strcpy(char destination[], const char source[]) { while(*destination++ = *source++); }  #### ■_ 3. Free Java tests are a kind of “blacklisting” approach to governing implementations. Had we not added testOneIsNotZero, we could have returned PlusZero and the compiler would have been happy. In a free category, relations are “whitelisted:” an implementation must not have relations (constraints) other than the ones that are implied by the definitions. Java のテストは実装を governing するための “blacklisting” アプローチです。 Had we not added testOneIsNotZero, we could have returned PlusZero and the compiler would have been happy. free category では relations は “whitelisted:”です。 実装は定義により implied される一つを除いて relations (constraitnts) を持ってはいけません。 * “The free category” has no objects and no morphisms, because there aren't any specified. free category はオブジェクトも morphisms も持ちません。 なぜなら、一切 specified されていないからです。 * “The free category on one object N” has one object and one morphism, the identity 1_N. The morphism has to be there, because the definition of category says that every object has to have an identity morphism, but we can't put in any other morphisms. “The free category on one object N”は一つのオブジェクトと一つの morphism を持ちます。 その morphism は存在していなければなりません。なぜなら、他の morphisms を put することができないからです。 * “The free category on the directed graph G” 有効グラフ G 上の free category は A -f-> B | / G = h g | / V L C has o three objects A, B, C and 三つのオブジェクトA、B、C および o seven morphisms: + 1_A:A \to A, + 1_B:B \to B, + 1_C:C \to C, + f:A \to B, + g:B \to C, + h:A \to C, and + (g \circ f):A \to C. 以上七つの morphisms を持っています。 The three vertices and the three edges become objects and morphisms, respectively. The three identity morphisms and (g \circ f) are required to be there because of the definition of a category. And because the category is free, we know that (g \circ f) \ne h. 三つの vertices と三つの edges は対応したオブジェクトと morphisms になります。 この三つの identity morphsms と (g \circ f) は圏の定義によって要求されます。 またこの圏は free であるので、(g \circ f) \ne h であることがわかります。 For abritrary directed graphs, the free category on the graph has the vertices of the graph as objects and the paths in the graph as morphisms. 任意の有効グラフのために、そのグラフ上の free category はオブジェクトとしての vertices of the graph と morphisms としての paths in the graph を持っています。 * The parity monoid is completely defined by “the free category on one object N, one morphism \mbox{PlusOne}:N\to N, and one relation \mbox{PlusOne} \circ \mbox{PlusOne} = 1_{N}.” * If we leave out the relation and consider “the free category on one object N and one morphism \mbox{PlusOne}:N\toN,” then we get the natural numbers under addition: since we didn't specify a point at which repeated applications of PlusOne “wrap around” back to zero, the free category has no such constraint. To add the number n, we form \underbrace{\mbox{PlusOne} \circ \mbox{PlusOne} \circ \cdots \circ \mbox{PlusOne}}_{n \mbox{ copies}}. * Exercise: what is the common name for “the free category on one object N and two morphisms \mbox{ConcatenateZero}:N\to N, \mbox{ConcatenateOne}:N\to N?” What should the identity morphism be named? 問題 common name はなにか  #### ■_ ### 2011年03月27日 #### ■_ 日曜日夜のこの時間がない感ときたら。 積読解消。 プロジェクトのリーダーの人もそうでない人にもオススメ。 100点満点、減点評価を止めようというのは同意しますが、 青天井加点評価へどう移行するのか気にはなったり (「成果主義」の二の舞になりそうな気が)。 積読のまま行方不明になっていたのを再発見したので読んだ。 「教えすぎてはいけない」というのはそうなんでしょうね。 とはいえ、教わるほうも雛鳥がえさをねだるごとく教えてくれくれいう人が多い気もします。 CPython 3.2 ソースコードリーディング - [PARTAKE] に参加してきました。 いろいろメモ書きやらしたのですが時間ががが。 とりあえず一言。 門番?の警備員さん怖かった。 OS-9 (MacOSぢゃないよ)ってソースコードは今も読めないのだっけ。 CP/Mは公開されたような覚えがあるんだけど。 OS-9 - Wikipedia CP/M - Wikipedia #### ■_ マ板から Rubyist とか Pythonista とか聞くけど、ほかの言語ではどういう呼称が使われてるんだろう。 あ、英語圏で。なんでもかんでも ～er ってこたないと思うのだけど。 スレッドを立てるまでもない質問雑談スレ42 489 仕様書無しさん [sage] 2011/03/22(火) 13:43:24.37 ID: Be: Ruby使いはRubyist、Cobol使いはCOBOLer、じゃあC使いは何なの？ 497 仕様書無しさん [] 2011/03/25(金) 21:55:15.17 ID: Be: ブチョーがコードレビューするのだが どーせ隅から隅まで見るわけねえだろ と思って 一箇所にだけ unsigned int auto ってやったんだわ したらブチョーが「おい このautoなんだ」 見とったわ ブチョー隅まで見てたわ ビビッた 498 仕様書無しさん [sage] 2011/03/26(土) 06:32:24.64 ID: Be: 薄目で全体を見てるんだよキット 499 仕様書無しさん [sage] 2011/03/27(日) 02:10:18.67 ID: Be: >>490 純粋なSE会社はコード書かないよ。客から話を聞だして文書作るのが仕事。 コミュニケーションと技術的に可否の判断が必須、コーディングの知識なんてクソも役に立たんよ。 某JRシ○テムはそうだった。気をつけて。 >>497 もうすぐC++じゃautoさんが生まれ変わっちゃうけどデフォルトでstaticな環境ってまだ存在すんのかな。  #### ■_ Category Theory for the Java Programmer ぼちぼちと。 Category Theory for the Java Programmer « reperiendi Category Theory for the Java Programmer (Javaプログラマーのための圏論) (略) 1. Objects (オブジェクト) Both in Java and in category theory, objects are members of a collection. In Java, it is the collection of values associated to a class. For example, the class Integer may take values from -2^{63} up to 2^{63}-1. The class Java と圏論の両方でオブジェクトは collection のメンバーです。 Java においてはオブジェクトはあるクラスに結び付けられた値の collection です。 たとえば Integerクラスは -2^{63} から 2^{63}-1 までの値を取ります。 enum Seasons { Spring, Summer, Fall, Winter } has four possible values. というクラスは四つの possible values を持っています。 In category theory, the collection of objects is “half” of a category. The other part of the category is a collection of processes, called “morphisms,” that go between values; morphisms are the possible ways to get from one value to another. The important thing about morphisms is that they can be composed: we can follow a process from an initial value to an intermediate value, and then follow a second process to a final value, and consider the two processes together as a single composite process. 圏論ではオブジェクトの collection は “half” of a category (圏の“半分”?) です。 The other part of the category is a collection of processes, called “morphisms,” 圏のもう半分は“morphisms”と呼ばれる value 間の process の collection です。 morphisms are the possible ways to get from one value to another. morphisms で重要なことはそれが compose (集める) できるということです。 initial value から intermediate value への process 、そしてそこから final value への second process を follow することが可能であり、 この二つの process をまとめてひとつの composite process とみなせるのです。 2. Categories (圏) Formally, a category is 形式的には圏とは * a collection of objects, and オブジェクトの集まりと * for each pair of objects A, B, a set {\rm hom}(A, B) of morphisms between them オブジェクト A、B のペアごとの間にある morphisms {\rm hom}(A, B) の集合です。 #にほんごおかしい○|￣|＿ such that * for each object X the set {\rm hom}(X, X) contains an identity morphism 1_X, * for each triple of objects X, Y, Z and pair of morphisms f \in {\rm hom}(X, Y) and g \in {\rm hom}(Y, Z), there is a composite morphism (g \circ f) \in {\rm hom}(X, Z), * for each pair of objects X, Y and morphism f\in {\rm hom}(X, Y), the identitiy morphisms are left and right units for composition: 1_Y \circ f = f = f \circ 1_X, and * for each 4-tuple of objects X, Y, Z, W and triple of morphisms f \in {\rm hom}(X, Y), g \in {\rm hom}(Y, Z), h \in {\rm hom}(Z, W) composition is associative: h\circ (g \circ f) = (h \circ g)\circ f. If a morphism f \in {\rm hom}(A,B), category theorists write f:A\to B. ある morphism f が {\rm hom}(A,B) にあった場合、圏論ではそれを f:A\to B のように書きます。 We can also define a category in Java. A category is any implementation of the following interface: Javaでも圏を定義できます。 ある圏は以下のインターフェースの実装となります: interface Category { interface Object {} interface Morphism {} class IllegalCompositionError extends Error; Object source(Morphism); Object target(Morphism); Morphism identity(Object); Morphism compose(Morphism, Morphism) throws IllegalCompositionError; }; that passes the following tests on all inputs: このインターフェースはすべての入力に対して以下のテストを pass します: void testComposition(Morphism g, Morphism f) { if (target(f) != source(g)) { assertRaises(compose(g, f), IllegalCompositionError); } else { assertEqual(source(compose(g, f)), source(f)); assertEqual(target(compose(g, f)), target(g)); } } void testAssociativity(Morphism h, Morphism g, Morphism f) { assertEqual(compose(h, compose(g, f)), compose(compose(h, g), f)); } void testIdentity(Object x) { assertEqual(source(identity(x)), x); assertEqual(target(identity(x)), x); } void testUnits(Morphism f) { assertEqual(f, compose(identity(target(f)), f)); assertEqual(f, compose(f, identity(source(f)))); } One kind of category that programmers use every day is a monoid. Monoids are sets equipped with a multiplication table that's associative and has left- and right units. Monoids include everything that's typically called addition, multiplication, or concatenation. Adding integers, multiplying matrices, concatenating strings, and composing functions from a set to itself are all examples of monoids. プログラマーが日々使っている圏の一種にモノイドがあります。 モノイドとは sets equipped with a multiplication table that's associative and has left- and right units です。 モノイドは典型的には加算、乗算、連結とよばれるようなものすべてを含みます。 a set to itself からの整数の加算、行列の乗算、文字列の連結、関数の合成 はすべてモノイドの例です。 In the language of category theory, a monoid is a one-object category (monos is Greek for “one”). The set of elements you can “multiply” is the set of morphisms from that object to itself. 圏論の用語では、あるモノイドは one-object category です( mono とはギリシア語での “one”)。 乗算可能な要素の集合は object to it self からなる morphisms の集合です。 In Java, a monoid is any implementation of the Category interface that also passes this test on all inputs: Java ではモノイドはすべての入力について以下のテストも pass する Categroy インターフェースの任意の実装です。 void testOneObject(Morphism f, Object x) { assertEqual(source(f), x); } The testOneObject test says that given any morphism f and any object x, the source of the morphism has to be that object: given another object y, the test says that source(f) == x and source(f) == y, so x == y. Therefore, if the category passes this test, it has only one object. testOneObject テストでは 与えられた morphism f と任意のオブジェクト x、その morphism の source が そういったオブジェクトでなければならないとしています。 別のオブジェクト y が与えられたならば、このテストは source(f) == x かつ source(f) == y でありしたがって x == y であると主張しています。 したがって、もしその圏がこのテストを pass したなら、その圏はただひとつだけの オブジェクトを持っています。 For example, consider the monoid of XORing bits together, also known as addition modulo 2. Adding zero is the identity morphism for the unique object; we need a different morphism to add 1: 例としてaddition modulo 2 としても知られている XORing bits together のモノイドを考えてみましょう。 ゼロを加えることは unique なオブジェクトに対する morphism の identity であり、 1を加えるには異なる morphism が必要です。 interface Parity implements Category { Morphism PlusOne(); } The existence of the PlusOne method says that there has to be a distinguished morphism, which could potentially be different from the identity morphism. The interface itself can't say how that morphism should behave, however. We need tests for that. The testPlusOne test says that (1 + 1) % 2 == 0. The testOneIsNotZero test makes sure that we don't just set 1 == 0: since (0 + 0) % 2 == 0, the first test isn't enough to catch this case. PlusOne メソッドの存在は区別された morphism が存在してなければならないことを 主張しています。 その morphism は identity morphism とは異なる可能性があります。 インターフェースそれ自身はその morphism がどのように振舞うべきかについて言及していません。 ですからそのためのテストが必要です。 testPlusOne テストは (1 + 1) % 2 == 0 であることを主張しています。 testOneIsNotZero テストは 1 == 0 としないことを保証します。 (0 + 0) % 2 == 0 であるので、最初のテストはこのケースを catch するには不十分です。 void testOnePlusOne(Object x) { assertEqual(compose(PlusOne(), PlusOne()), identity(x)); } void testOneIsNotZero(Object x) { assertNotEqual(PlusOne(), identity(x)); } Here's one implementation of the Parity interface that passes all of the tests for Category, testOneObject, and the two Parity tests: 以下の例は Parity インターフェースの実装で、 圏、testOneObject に対するテストのすべてと 二つの Parity テストを渡します: class ParityImpl implements Parity { enum ObjectImpl implements Object { N }; enum MorphismImpl implements Morphism { PlusZero, PlusOne }; Object source(Morphism) { return N; } Object target(Morphism) { return N; } Morphism identity(Object) { return PlusZero; } Morphism compose(Morphism f, Morphism g) { if (f == PlusZero) return g; // 0 + g = g if (g == PlusZero) return f; // f + 0 = f return PlusZero; // 1 + 1 = 0 } Morphism PlusOne() { return PlusOne; } }  つづく。 わかったようなわからないような。 #### ■_ げしょ。 ### 2011年03月26日 #### ■_ ・ダムエー ガンダム創世が最終回。これ、ガンダムさんの単行本に一緒に収録されてたのね。 ガンダムさんは買ってなかったから気がつなかんだ。 オリジンはまあそのなんだ。 を見かけたけど、前のとの違いはとりあえずグラフ理論の章が増えたことみたい (分量的にはそれほどでも)。スルーでいいかな。 一緒に並んでいた は、入門には割といいんじゃなかろうかという感じ。 全ページカラーで図解いりだし。 #### ■_ Javaプログラマーのための圏論 良くみたら2007年11月の記事か。 古いとダメということはないけど今これが再発見されたんだろうか Category Theory for the Java Programmer « reperiendi Category Theory for the Java Programmer (Javaプログラマーのための圏論) Posted in Category theory, Math, Programming by reperiendi on 2007 November 3 There are several good introductions to category theory, each written for a different audience. However, I have never seen one aimed at someone trained as a programmer rather than as a computer scientist or as a mathematician. There are programming languages that have been designed with category theory in mind, such as Haskell, OCaml, and others; however, they are not typically taught in undergraduate programming courses. Java, on the other hand, is often used as an introductory language; while it was not designed with category theory in mind, there is a lot of category theory that passes over directly. いくつかの優れた、圏論への introduction があって、それぞれ異なる読者に向けて 書かれています。しかし、わたしはこれまでコンピューター科学者でも数学者でもない、 プログラマーに向けて書かれたものを見たことがありません。Haskell や OCmal のように圏論に関係深く設計されている言語はいくつかありますが、そういった 言語は大学のプログラミングコースで教えられるようなものであるのが典型です。 一方で、Java は introductory language として頻繁に使われています。 Java は圏論を念頭において設計された言語ではありませんが、 I'll start with a sentence that says exactly what the relation is of category theory to Java programming; however, it's loaded with category theory jargon, so I'll need to explain each part. A collection of Java interfaces is the free cartesian category with equalizers on the interface objects1 and the built-in objects. 1. Objects 以下略  #### ■_ Language of the Month: Mirah Dr Dobbs ジャーナルには今月の言語 (Language of the Month) なんてのがあるのか。 Dr Dobbs | Language of the Month: Mirah What happens when you take Ruby syntax, add static typing, and run it on the JVM? Charles Nutter, the designer of JRuby, shows us Ruby の構文を持ってきて、静的な型付けを加えてそれをJVM上で実行したら 何がおきるでしょうか? JRuby の desinger である Charles Nutter はわたしたちに それを見せてくれました。 The JVM is an interesting place these days. We'll have Java 7 soon, with support for dynamic invocation and improved APIs. We have a many programming languages to choose from — some statically typed, some dynamically typed, some blurring the lines — that offer their own unique benefits. We're even getting a few "small changes" to Java itself, like literal lists and Strings-in-switch. We have the largest and most robust ecosystem of any managed runtime, with hundreds of conferences, tens of thousands of libraries, and millions of developers. 今日において JVM は interesting な場所です。 まもなくわたしたちは dynamic invoation と改良されたAPI をサポートする Java 7 を手にします。わたしたちにはそれぞれが選択肢として unique な利点を offer する 多くのプログラミング言語があります。 そういった選択肢には静的型付けや動的型付け、blurring the lines といったものがあります。 Java それ自身にはリテラルリストや文字列による switch のような 少数の“小さな変更”しかありません。 わたしたちには何百という conferences を持ち、数千のライブラリを有し、そして 何百万という開発者を持った最も大きく最も頑強なmanaged runtime のエコシステムがあります。 But there's something missing. No language seems likely to replace Java itself. しかし何かが足りないのです。 Javaそのものを置き換えてしまう言語はないように思われます。 Or at least, none of the alternatives fits what I need out of a Java replacement. 少なくとも、Javaを置き換えるのにわたしが必要だと思っていることに fit する alternatives はありません。 Learning from Java Java から学ぶ (略) The Mirah Programming Language Mirah grew out of my desire to have a language for implementing JRuby that would be more approachable to non-Java developers. Let's face it, Java's not too hard to learn, but there are lots of details that take time to internalize. It's not a complicated language, but it can scare the uninitiated. With Mirah, I wanted to make a language that fit my criteria, so that I and others would have the Java replacement we've been wanting. (略)  #### ■_ Java は死なず ちょっと前にあったやつに対する反論らしい ＞ Rise and Fall and Rise Not Dead Yet: The Rise and Fall and Rise of Java – tecosystems Not Dead Yet: The Rise and Fall and Rise of Java FOSDEM In November of last year, Forrester analyst Mike Gualtieri published the provocatively titled “Java Is A Dead-End For Enterprise App Development.” This January, the firm's John Rymer followed up with a more balanced but similarly pessimistic “The Future of Java.” Collectively, these may be considered representative of the conventional wisdom of the enterprise. 昨年十一月、Forrester のアナリスト Mike Gualtieri は “Java Is A Dead-End For Enterprise App Development” という provcatively なタイトルのついたものを公開しました。 今年一月にはその firm の John Rvmer が、よりバランスの取れたしかしやはり悲観的な フォローアップ記事 “The Future of Java” を公開しました。 The reality that we at RedMonk observe differs substantially, as might be predicted given our sources. What you think depends on who you listen to, remember. Forrester, as we have discussed previously [coverage], primarily concerns itself with enterprise software buyers and “platform software decision-makers.” RedMonk, meanwhile, is oriented around practitioners; practitioners like those attending FOSDEM, for example. We advantage this audience simply because we believe that bottom up adoption is more predictive of technology direction than top down procurement, but reasonable minds may obviously disagree. (以下略)  誰か全文を(ry #### ■_ Reviewing Perl Best Practices, Chapter 15, Objects - House Absolute(ly Pointless) Reviewing Perl Best Practices, Chapter 15, Objects By Dave Rolsky on March 23, 2011 10:40 AM Perl Best Practices (PBP) was released in 2004, about 7 years ago. Perl is a great language, but its culture of TIMTOWTDI can be a problem. These days, we often hear that "there's more than one way to do it, but sometimes consistency is not a bad thing either" (TIMTOWTDIBSCINABTE). PBP deserves a lot of credit for encouraging discussion about the downsides of TIMTOWTDI. Perl Best Practices (PBP) はほぼ7年前、2004年にリリースされたものです。 Perl は great な言語ではありますが、TIMTOWTDI の文化は問題となる可能性を 孕んでいます。今日、わたしたちは "there's more than one way to do it, but sometimes consistency is not a bad thing either" (TIMTOWTDIBSCINABTE). ということを頻繁に耳にします。 PBP は TIMTOWTDI の downsides についての PBP has a lot of recommendations. A lot of them are fairly trivial, like recommendations on code formatting. I'm a big fan of using a consistent style throughout a single code base, but I'm flexible about what that style should be. Some of his other recommendations are more meaningful, and deserve some review 7 years later. PBP は多くの recommendations があります。 その多くは コードのフォーマットの推奨のように trivial なものです。 わたしは単一のコードベースを通じて一貫したスタイルを使っていることがとても好きです。 とはいえ、わたしはあるべきスタイルについては flexible です。 彼のそのほかの recommendations の一部はより meaningful なものであり、 7年たった今でもいくつかレビューを受けています。 I'll take a look at each recommendation in Chapter 15, Objects, and see how it's stood up in the era of Modern Perl. Maybe I'll tackle some other chapters in the future. (略) Conclusion This chapter of PBP holds up pretty well 7 years later. He was out of sync with the community on accessor naming in 2004, and that hasn't changed. I disagree with him on overloading, but I don't know that there's any community agreement on that. Other chapters talk more about inside-out objects, and I think he missed the mark in thinking those were the future of Perl OO. I'll cover that in more detail if I review one of those chapters. His remaining recommendations remain pretty solid in 2011.  #### ■_ ### 2011年03月25日 #### ■_ 買った あとがきが重いっすよ… ○|￣|＿ モーニング掲載時から改変されたエピソードがあるとか→ 【乗客専務】カレチ【車掌】 ・アフタヌーン 主役が登場したのどれだけぶりだろう＞むげにん ・ダムエー 26日発売掲載ので最終回? ＞オリジン #### ■_ C/C++プログラミングの「迷信」と「誤解」 フラゲして買ってたんですが、いろいろ割り込みがあったりなかったり。 正直重箱の隅過ぎないかというのもあったりしましたが、読んでて面白かったです。 一番のお気に入り(?) は 「scanfはバッファオーバーランを防げない?」かな。 意地悪な入力パターンを考えると結構手間なんですよね。＞ fgets + sscanf とか でもこの本で一番いいなと思ったのは、ところどころ挟まるこういう漫画や挿絵だったりして (高木さんごめんなさい)。 #### ■_ 便利だけどあまり知られていないデータ構造 回答編。リンクまではむり ○|￣|＿ 知らないのが結構あるなあ。 以下適当に抜粋 language agnostic - What are the lesser known but cool data structures ? - Stack Overflow There are some data structures around that are really cool but are unknown to most programmers. Which ones are they? Everybody knows about linked lists, binary trees, and hashes, but what about Skip lists and Bloom filters for example. I would like to know more data structures that are not so common, but are worth knowing because they rely on great ideas and enrich a programmer's tool box. PS: I am also interested in techniques like Dancing links which make clever use of properties of a common data structure. EDIT: Please try to include links to pages describing the data structures in more detail. Also, try to add a couple of words on why a data structures is cool (as Jonas Kölker already pointed out). Also, try to provide one data-structure per answer. This will allow the better data structures to float to the top based on their votes alone. language-agnostic data-structures computer-science self-improvement  Rope: It's a string that allows for cheap prepends, substrings, middle insertions and appends. I've really only had use for it once, but no other structure would have sufficed. Regular strings and arrays prepends were just far too expensive for what we needed to do, and reversing everthing was out of the question.  Here are a few: * Suffix tries. Useful for almost all kinds of string searching (http://en.wikipedia.org/wiki/Suffix_trie#Functionality). See also suffix arrays; they're not quite as fast as suffix trees, but a whole lot smaller. * Splay trees (as mentioned above). The reason they are cool is threefold: o They are small: you only need the left and right pointers like you do in any binary tree (no node-color or size information needs to be stored) o They are (comparatively) very easy to implement o They offer optimal amortized complexity for a whole host of "measurement criteria" (log n lookup time being the one everybody knows). See http://en.wikipedia.org/wiki/Splay_tree#Performance_theorems * Heap-ordered search trees: you store a bunch of (key, prio) pairs in a tree, such that it's a search tree with respect to the keys, and heap-ordered with respect to the priorities. One can show that such a tree has a unique shape (and it's not always fully packed up-and-to-the-left). With random priorities, it gives you expected O(log n) search time, IIRC. * A niche one is adjacency lists for undirected planar graphs with O(1) neighbour queries. This is not so much a data structure as a particular way to organize an existing data structure. Here's how you do it: every planar graph has a node with degree at most 6. Pick such a node, put its neighbors in its neighbor list, remove it from the graph, and recurse until the graph is empty. When given a pair (u, v), look for u in v's neighbor list and for v in u's neighbor list. Both have size at most 6, so this is O(1). By the above algorithm, if u and v are neighbors, you won't have both u in v's list and v in u's list. If you need this, just add each node's missing neighbors to that node's neighbor list, but store how much of the neighbor list you need to look through for fast lookup.  I'm surprised no one has mentioned Merkle trees (ie. Hash Trees). Used in many cases (P2P programs, digital signatures) where you want to verify the hash of a whole file when you only have part of the file available to you.   * Kd-Trees, spatial data structure used (amongst others) in Real-Time Raytracing, has the downside that triangles that cross intersect the different spaces need to be clipped. Generally BVH's are faster because they are more lightweight. * MX-CIF Quadtrees, store bounding boxes instead of arbitrary point sets by combining a regular quadtree with a binary tree on the edges of the quads. * HAMT, hierarchical hash map with access times that generally exceed O(1) hash-maps due to the constants involved. * Inverted Index, quite well known in the search-engine circles, because it's used for fast retrieval of documents associated with different search-terms. Most, if not all, of these are documented on the NIST Dictionary of Algorithms and Data Structures  Work Stealing Queue Lock-free data structure for dividing the work equaly among multiple threads http://stackoverflow.com/questions/2101789/implementation-of-a-work-stealing-queue-in- c-c  Left Leaning Red-Black Trees. A significantly simplified implementation of red-black trees by Robert Sedgewick published in 2008 (~half the lines of code to implement). If you've ever had trouble wrapping your head around the implementation of a Red-Black tree, read about this variant. Very similar (if not identical) to Andersson Trees.  Fenwick Tree. It's a data structure to keep count of the sum of all elements in a vector, between two given subindexes i and j. The trivial solution, precalculating the sum since the begining doesn't allow to update a item (you have to do O(n) work to keep up). Fenwick Trees allow you to update and query in O(log n), and how it works is really cool and simple. It's really well explained in Fenwick's original paper, freely available here: http://www.cs.ubc.ca/local/reading/proceedings/spe91-95/spe/vol24/issue3/spe884.pdf Its father, the RQM tree is also very cool: It allows you to keep info about the minimum element between two indexes of the vector, and it also works in O(log n) update and query. I like to teach first the RQM and then the Fenwick Tree.  Pairing heaps are a type of heap data structure with relatively simple implementation and excellent practical amortized performance.  Ball Trees. Just because they make people giggle. A ball tree is a data structure that indexes points in a metric space. Here's an article on building them. They are often used for finding nearest neighbors to a point or accelerating k-means.  language agnostic - What are the lesser known but cool data structures ? - Stack Overflow Enhanced hashing algorithms are quite interesting. Linear hashing is neat, because it allows splitting one "bucket" in your hash table at a time, rather than rehashing the entire table. This is especially useful for distributed caches. However, with most simple splitting policies, you end up splitting all buckets in quick succession, and the load factor of the table oscillates pretty badly. I think that spiral hashing is really neat too. Like linear hashing, one bucket at a time is split, and a little less than half of the records in the bucket are put into the same new bucket. It's very clean and fast. However, it can be inefficient if each "bucket" is hosted by a machine with similar specs. To utilize the hardware fully, you want a mix of less- and more-powerful machines.   Splash Tables are great. They're like a normal hash table, except they guarantee constant-time lookup and can handle 90% utilization without losing performance. They're a generalization of the Cuckoo Hash (also a great data structure). They do appear to be patented, but as with most pure software patents I wouldn't worry too much.  2-3 Finger Trees by Hinze and Paterson are a great functional data structure swiss-army knife with great asymptotics for a wide range of operations. While complex, they are much simpler than the imperative structures by Persistent lists with catenation via recursive slow-down by Kaplan and Tarjan that preceded them. They work as a catenable deque with O(1) access to either end, O(log min(n,m)) append, and provide O(log min(n,length - n)) indexing with direct access to a monoidal prefix sum over any portion of the sequence. Implementations exist in Haskell, Coq, F#, Scala, Java, C, Clojure, C# and other languages. You can use them to implement priority search queues, interval maps, ropes with fast head access, maps, sets, catenable sequences or pretty much any structure where you can phrase it as collecting a monoidal result over a quickly catenable/indexable sequence. I also have some slides describing their derivation and use.  Fast Compact tries: * Judy arrays: Very fast and memory efficient ordered sparse dynamic arrays for bits, integers and strings. Judy arrays are faster and more memory efficient than any binary-search-tree. * HAT-trie: A Cache-conscious Trie-based Data Structure for Strings * B-tries for Disk-based String Management  Splay Trees are cool. They reorder themselves in a way that moves the most often queried elements closer to the root.  BK-Trees, or Burkhard-Keller Trees are a tree-based data structure which can be used to quickly find near-matches to a string.  I sometimes use Inversion LIsts to store ranges, and they are often used to store character classes in regular expressions. See for example http://www.ibm.com/developerworks/linux/library/l-cpinv.html Another nice use case is for weighted random decisions. Suppose you have a list of symbols and associated probabilites, and you want to pick them at random according to these probabilities a => 0.1 b => 0.5 c => 0.4 Then you do a running sum of all the probabilities: (0.1, 0.6, 1.0) This is your inversion list. You generate a random number between 0 and 1, and find the index of the next higher entry in the list. You can do that with a binary search, because it's sorted. Once you've got the index, you can look up the symbol in the original list. If you have n symbols, you have O(n) preparation time, and then O(log(n)) acess time for each randomly chosen symbol - independently of the distribution of weights. A variation of inversion lists uses negative numbers to indicate the endpoint of ranges, which makes it easy to count how many ranges overlap at a certain point. See http://www.perlmonks.org/index.pl?node_id=841368 for an example.  Bucket Brigade They are used extensively in Apache. Basically they are a linked list that loops around on itself in a ring. I am not sure if they are used outside of Apache and Apache modules but they fit the bill as a cool yet lesser known data structure. A bucket is a container for some arbitrary data and a bucket brigade is a collection of buckets. The idea is that you want to be able to modify and insert data at any point in the structure. Lets say that you have a bucket brigade that contains an html document with one character per bucket. You want to convert all the < and > symbols into &lt; and &gt; entities. The bucket brigade allows you to insert some extra buckets in the brigade when you come across a < or > symbol in order to fit the extra characters required for the entity. Because the bucket brigade is in a ring you can insert backwards or forwards. This is much easier to do (in C) than using a simple buffer. Some reference on bucket brigades below: Apache Bucket Brigade Reference Introduction to Buckets and Brigades  Zippers A zipper is a concept that can be used with trees, or lists, and is extremely useful in functional programming. The idea is to store a linked data structure with a "context," and then use the context to move up and down the linked data structure when necessary. See http://www.haskell.org/haskellwiki/Zipper for a much better description.   Fenwick trees (or binary indexed trees) are a worthy addition to ones toolkit. If you have an array of counters and you need to constantly update them while querying for cumulative counts (as in PPM compression), Fenwick trees will do all operations in O(log n) time and require no extra space. See also this topcoder tutorial for a good introduction.   I really really love Interval Trees. They allow you to take a bunch of intervals (ie start/end times, or whatever) and query for which intervals contain a given time, or which intervals were "active" during a given period. Querying can be done in O(log n) and pre-processing is O(n log n).  language agnostic - What are the lesser known but cool data structures ? - Stack Overflow   PQ-Trees  Someone else already proposed Burkhard-Keller-Trees, but I thought I might mention them again in order to plug my own implementation. :) http://well-adjusted.de/mspace.py/index.html There are faster implementations around (see ActiveState's Python recipes or implementations in other languages), but I think/hope my code helps to understand these data structures. By the way, both BK and VP trees can be used for much more than searching for similar strings. You can do similarity searches for arbitrary objects as long as you have a distance function that satisfies a few conditions (positivity, symmetry, triangle inequality).  Arne Andersson trees are a simpler alternative to red-black trees, in which only right links can be red. This greatly simplifies maintenance, while keeping performance on par with red-black trees. The original paper gives a nice and short implementation for insertion and deletion.   I think the FM-index by Paolo Ferragina and Giovanni Manzini is really cool. Especially in bioinformatics. It's essentially a compressed full text index that utilizes a combination of a suffix array and a burrows-wheeler transform of the reference text. The index can be searched without decompressing the whole index.  Ternary Search Tree * Quick prefix search (for incremental autocomplete,etc) * Partial Matching (When you want to find all words within X hamming distance of a string) * Wildcard Searches Quite Easy to implement.  #### ■_ エクササイズ重要 以前、二分探索を書かせてもきちんとそれを書けるプログラマーというのは少ない という話題を紹介しましたが、その続き。 Common bugs and why exercises matter (binary search part 2) | The Reinvigorated Programmer Common bugs and why exercises matter (binary search part 2) Posted on April 21, 2010 by Mike Taylor| 62 Comments Many, many thanks to all of you who have contributed to the astonishing response to the previous article on the difficulty of getting binary search right. I've never seen anything like this: in just over 24 hours since I posted the article, 541 comments have been posted, and they're still rolling in. (That number is a bit inflated by people re-posting the same code-samples multiple times after seeing WordPress mangle their source, but even taking that into account it still amazes me.) You guys rule. Seriously. I have a lot to say in response to it all. Read on … (さくっと略) Anyway, even allowing the various biasing factors, it seems possible — maybe even likely — that our hit rate is better than 10%. Common bugs (よく見受けられたバグ) It's been interesting to see that many of the programs that had bugs were going wrong in the same ways. Common bugs included: * Not dealing correctly with a zero-element array 要素がゼロ個の配列を正しく扱っていない * Not dealing correctly with a one-element array 要素が一個の配列を正しく扱っていない * Not finding a value in the last element of the array 配列の末尾の要素にある値を見つけられない * Not dealing with failed searches (the sought element is not there) 検索に失敗した(検索対象が存在していない)場合に対処していない * Not coping with repeated elements in the array 配列中の要素に繰り返しがあったときの対処がない * Not omitting the midpoint from the subranges when narrowing the range after a probe. (This results in an infinite loop for some inputs.) probe のあとで範囲を狭めるときに部分範囲から midpoint を除去していない (一部の入力に対して無限ループになってしまう可能性がある)。 * Pointing to the first and last members of the range but coding as though the l ast-pointer variable pointed to the element after the range (or vice versa). 対象となる範囲の先頭や末尾を point するが、その先が範囲から外れたものを 指しているコーディングをしている (And, yes, not dealing with a zero-element array is a bug, for those commenters who asked. Zero is a perfectly good number of elements for an array to have, and an array search that fails for empty arrays is no more acceptable than an addition operator that fails if you add zero. A few commenters also asked whether it's OK to ignore the case when a null Array reference is passed in: yes, of course — that would violate the precondition that the search is in a pre-sorted array, so all bets are off. No array at all is completely different from an empty array.) The interesting thing about these common failure modes is that they are all so predictable. If, before starting to code, we'd each taken two minutes to write down the ways in which we could imagine our programs failing, I bet we'd all have listed most if not all of these. Yet even knowing that, we find them hard to avoid. In light of the list of common bugs, we can easily see what kinds of test cases we need to include in the suite: empty arrays, single-element arrays, arrays where the sought element is absent, or appears only at the end, or only at the start, or multiple times; and of course a healthy does of randomly generated sorted arrays. Looking at the list of bugs, I find myself thinking of the rules in Kernighan and Pike' s Elements of Programming Style. Back when I reviewed that book, and listed the rules that are gathered at the end of the article, some commenters felt that they were so obvious, so trivial, as to be worthless or even insulting. And yet most of the mistakes listed above are covered in K&P's list. A few of the relevant rules are: * “Write clearly — don't be too clever.” 明確わかりやすくに書こう。小賢しくならず。 * “Make sure your code “does nothing” gracefully.” * “Watch out for off-by-one errors.” 一個外れのエラーには注意しよう。 * “Take care to branch the right way on equality.” 正しいやり方で等価性を判定して注意深く分岐しよう * “Make sure special cases are truly special.” 特殊なケースは本当に特殊なものにしよう。 * “Choose a data representation that makes your program simple.” プログラムをシンプルにするデータの表現を選択しよう #書法ひいたほうが良かったかな Then there is the rule that the I didn't let you follow literally, but the spirit of which should pervade our design process: “Test programs at their boundary values”. And the very general but very relevant rule “Don't stop with your first draft.” And of course I must mention my favourite: “Say what you mean, simply and directly.” Why does this exercise matter? 以下略  このプログラム書法、オススメではあるのだけどいかんせん年代ものなので、 ↓のほうが読むには良いかも #### ■_ ### 2011年03月24日 #### ■_ 時間が足りない感が 「構造体渡し」って聞かないよなあ "構造体渡し" - Google 検索 #### ■_ Why “Volatile” Fixes the 2.2250738585072011e-308 Bug - Exploring Binary Why “Volatile” Fixes the 2.2250738585072011e-308 Bug (なぜ“Volatile”で 2.2250738585072011e-308 Bug を修正できたのか) By Rick Regan (Published January 26th, 2011) Copyright c 2008-2011 Exploring Binary http://www.exploringbinary.com/why-volatile-fixes-the-2-2250738585072011e-308-bug/ Recently I discovered a serious bug in x87 builds of PHP: PHP's decimal to floating-point conversion routine, zend_strtod(), went into an infinite loop when converting the decimal string 2.2250738585072011e-308 to double-precision binary floating-point. This problem was fixed with a simple one line of code change to zend_strtod.c: Recently I discovered a serious bug in x87 builds of PHP: decimal から floating-point へ変換するPHP の ルーチン zend_strtod() は decimal 文字列 2.2250738585072011e-308 を倍精度二進浮動小数点数に 変換するときに無限ループに落ち込んでいました。 この問題は zend_strtodd.c に対する単純な一行の変更によって解決されました: This line この行が double aadj, aadj1, adj; was changed to このように変更されました volatile double aadj, aadj1, adj; Why does this fix the problem? I uncovered the very specific reason: it prevents a double rounding on underflow error. なぜこれが問題を解決したのでしょうか? I uncovered the very specific reason: it prevents a double rounding on underflow error. The Failing Calculation in zend_strtod() zend_strtod() における Failing Calculation For input “2.2250738585072011e-308”, zend_strtod() is supposed to return 0x0.fffffffffffffp-1022, or in normalized binary scientific notation, “2.2250738585072011e-308”という入力に対し、 zend_strtod() は 0x0.fffffffffffffp-1022 もしくは正規化された二進科学表記 (normalized binary scientific notation) で 1.111111111111111111111111111111111111111111111111111 x 2-1023. を返します。 This is a subnormal number, 2-1074 below DBL_MIN, which is 2-1022. これは非正規化数(subnormal number) で、 Instead, zend_strtod() goes into an infinite loop. This section of code, appearing inside a loop, is responsible for the problem: ところが zend_strtod() は無限ループへと落ち込みます。 ループの内側にあるコードのこのセクションは rsponsible for the problem のように見えます。 /* Compute adj so that the IEEE rounding rules will * correctly round rv + adj in some half-way cases. * If rv * ulp(rv) is denormalized (i.e., * y <= (P-1)*Exp_msk1), we must adjust aadj to avoid * trouble from bits lost to denormalization; * example: 1.2e-307 . */ if (y <= (P-1)*Exp_msk1 && aadj >；= 1.) { aadj1 = (double)(int)(aadj + 0.5); if (!dsign) aadj1 = -aadj1; } adj = aadj1 * ulp(value(rv)); value(rv) += adj; The two lines I've highlighted adjust the double-precision converted result, value(rv), so that it converges to the correctly rounded result. Although the source code expression of the calculation is correct, it fails when implemented in extended-precision arithmetic ? which is what happens when the variable adj is not declared volatile. zend_strtod() Executes In Double-Precision Mode zend_strtod() は倍精度モードで実行されている zend_strtod() executes in double-precision mode, so this is not your run-of-the-mill double-precision/extended-precision mismatch. So how does extended-precision come into play? Extended-precision numbers in the double-precision subnormal exponent range (-1023 through -1074) have 53 bits of precision, not the usual 1 to 52 bits. As I'll show, this leads to a special case of double rounding error ? a double rounding on underflow error. The result is that value(rv) remains unchanged ? and zend_strtod() goes into an infinite loop! Volatile Almost Forces Double-Precision Semantics volatile は double-precision semantics をほぼ強制する Floating-point calculations done with x87 FPU instructions are done in extended-precision registers, even when the processor is in double-precision mode and the variables being operated on are double-precision. This means that true double-precision semantics can't be emulated. Why? Because extended-precision values have a wider exponent range. This can almost be fixed by declaring double variables ‘volatile'. x87 FPU命令を使った浮動小数点数演算は、プロセッサーが倍精度モードで変数も倍精度のもの であったとしても拡張精度を持ったレジスターで行われます。これは本当の倍精度実数の セマンティクスがエミュレートできないことを意味します。なぜでしょうか? なぜなら拡張精度の値はより広い範囲の指数部を持っているからです。 これは、double の変数を 'volatie' として宣言してやることでほとんど解決されます。 The ‘volatile' keyword forces the compiler to store a variable in memory each time it's written, and to load it from memory each time it's read. This ensures that the exponent is constrained to the double-precision range; a number that is too large for double-precision will overflow, and a number that is too small will underflow. So far, so good. 'volatile'キーワードは、コンパイラーに対して変数に書き込みするたびにメモリーに格納 するように強制し、読み出しのたびにメモリーからロードするように強制するものです。 これは、指数部が倍精度の範囲に収まることを保証します。倍精度実数には大きすぎる 数値はオーバーフローを起こし、小さすぎる数値はアンダーフローを起こします。 But there remains one problem, in the handling of underflow: subnormal numbers will be doubly rounded. In double-precision mode, extended-precision results are rounded to 53 bits. If the result is subnormal relative to double-precision, it will be rounded again when written to memory ― to between 1 and 52 significant bits, the range of precision for a subnormal double-precision number. しかし問題がまだ一つ残っています。それはアンダーフローの取り扱いです:非正規化数は二重 丸めされてしまいます。倍精度モードでは、拡張精度の結果は53ビットに丸められます。ここで もしその丸めの結果が倍精度において subnornal reative であった場合、それをメモリーに書 き戻す際に再度丸めが行われて非正規化倍精度数の range of precision を1ビットから52ビッ トのsignificant bits に収まるように That said, the ‘volatile' keyword does fix the PHP problem at hand. It doesn't eliminate double rounding, but it does eliminate the specific double rounding error that causes the problem. (Does this mean there are other double rounding errors lurking? If so, hopefully the result is just an incorrect conversion, and not another infinite loop.) 'volatile' キーワードは PHP の問題を at hand で解決するのですが、二重丸めを ekiminate することはありません。ただし、問題を引き起こすような特定の倍精度丸めのエラーを eliminate します。 The Generated Code Without ‘Volatile' 'Volatile'なしで生成されたコード I took a fixed version of PHP ― PHP 5.3.5 ― and removed the ‘volatile' keyword to restore it to its pre-fixed state. I built it on an Ubuntu Linux system and traced it using the gdb debugger. Without the ‘volatile' keyword, the two highlighted lines of source code compile to these instructions: 0x8302a4a call 0x8301290 <ulp> ... 0x83012b9 fldl -0x10(%ebp) ; load ulp(value(rv)) 0x83012bc leave 0x83012bd ret ... 0x8302a4f mov -0x8c(%ebp),%eax 0x8302a55 fldl -0x30(%ebp) ; load value(rv) 0x8302a58 fldl -0x98(%ebp) ; load aadj1 0x8302a5e fmulp %st,%st(2) ; adj = aadj1 * ulp(value(rv)) 0x8302a60 faddp %st,%st(1) ; value(rv) += adj 0x8302a62 fstpl -0x30(%ebp) ; store value(rv) The problem occurs because the result of the multiplication, adj, is left on the FPU register stack, where it is then added to value(rv). This leads to a double rounding on underflow error, which I'll explain below. 乗算の結果 adj がFPU レジスタースタック上に置かれたままで、続いて value(rv) と 加算を行うために問題は発生します。これはアンダーフローエラー時の二重丸め (double rounding on underflow error) に繋がります。 The result of the fmulp instruction (adj) Before the fmulp instruction executes, the value of its two operands ― aadj1 and ulp(value(rv)) ― are on the stack (not shown). This is the FPU stack after the fmulp instruction at 0x8302a5e executes (I formatted gdb's output for display): fmulp 命令の実行前、その二つのオペランド aadj1 と ulp(value(rv)) はスタック上にあります。 以下に示すのは 0x8302a5e で fmulp 命令を実行したあとのFPUスタックです (gdb の画面出力を整形しました): st0 2.2250738585072013830902327173324041e-308 (raw 0x3c018000000000000000) st1 -2.8309023271733244206437071197035129e-324 (raw 0xbbcc92aee22acdf2d000) st2 0 (raw 0x00000000000000000000) st3 0 (raw 0x00000000000000000000) st4 0 (raw 0x00000000000000000000) st5 0 (raw 0x00000000000000000000) st6 0 (raw 0x00000000000000000000) st7 -0.57298100991279032889735844946699217 (raw 0xbffe92aee22acdf2d000) The result of the multiplication, adj, is in st1: 0xbbcc92aee22acdf2d000. That hexadecimal value represents an extended-precision floating-point number, which has three fields: a 1 bit sign field, a 15 bit excess-16383 exponent field, and a 64 bit fraction field (there is no implicit leading 1 bit in extended-precision). Translated into binary and separated into fields, 0xbbcc92aee22acdf2d000 is 乗算の結果である adj は st1 にありその内容は 0xbbcc92aee22acdf2d000 です。 この十六進値は拡張浮動小数点数を表現していて、三つのフィールドに分かれます。 二進で表し、またフィールドごとに分割すると、0xbbcc92aee22acdf2d000 は次のようになります 1 011101111001100 1001001010101110111000100010101011001101111100101101000000000000 The exponent field 011101111001100 is 15308 in decimal, and after subtracting 16383, represents an exponent of -1075. So the number this floating-point value represents is 指数部の 011101111001100 は十進表記すると 15308 で、 -1.001001010101110111000100010101011001101111100101101 x 2-1075. The Result of the faddp Instruction (value(rv)) Before the faddp instruction executes, the stack contains its two operands: the initial value of value(rv) in st0 (loaded with fldl instruction at address 0x8302a55) and adj in st1.\^ value(rv) is 0x3c018000000000000000: faddp の実行前、スタックには二つのオペランドがあります: st0 にある value(rv) の初期値と st1 にある adj です 0 011110000000001 1000000000000000000000000000000000000000000000000000000000000000. This represents 2-1022. This is the FPU stack after the faddp instruction at 0x8302a60 executes: faddp 命令が 0x8302a60 で実行されたあとのFPUスタックはこうなります: st0 2.225073858507201136057409796709132e-308 (raw 0x3c00fffffffffffff800) st1 0 (raw 0x00000000000000000000) st2 0 (raw 0x00000000000000000000) st3 0 (raw 0x00000000000000000000) st4 0 (raw 0x00000000000000000000) st5 0 (raw 0x00000000000000000000) st6 -0.57298100991279032889735844946699217 (raw 0xbffe92aee22acdf2d000) st7 2.2250738585072013830902327173324041e-308 (raw 0x3c018000000000000000) The result ― the new value of value(rv) ― is in st0; it is 0x3c00fffffffffffff800: 計算の結果、つまり value(rv) の値は st0 にあり、その内容は 0x3c00fffffffffffff800 です。 0 011110000000000 1111111111111111111111111111111111111111111111111111100000000000 This represents the 53 significant bit value 1.1111111111111111111111111111111111111111111111111111 x 2-1023. The Result of the fstpl Instruction (value(rv)) The fstpl at address 0x8302a62 stores the copy of value(rv) in st0 back to memory, which converts it to double-precision. Since the value of st0 is a subnormal number in double-precision, it must be rounded to 52 bits. This rounds value(rv) to 2-1022 (which ironically makes it normal). In other words, value(rv) remains unchanged! 0x8302a62番地にある fstpl はst0に置かれている value(rv) のコピーを倍精度に変換してメモ リーに書き戻します。st0の値は倍精度の非正規化数 (subnormal number)なので、52ビットに丸 める必要があります。この丸めは value(rv) を 2-1022 にします (which ironically makes it normal)。 言い換えると、value(rv) は変更されないままなのです! What Went Wrong: A Double Rounding on Underflow Error 何が間違ったのか: アンダーフローエラーに対する二段階丸め You have to look at the calculation of value(rv) + adj more closely to see what happened. なにが起こったのかを理解するためには、より詳しく value(rv) + adj の計算について 見ていく必要があります。 value(rv) = 2-1022 adj = -1.001001010101110111000100010101011001101111100101101 x 2-1075 If you add them by hand, using binary addition, you get this 104-bit result (I'd have shown you my calculation, but alignment of the exponents made the lines too long): これをbinary addition を使って手計算で加算した場合、その結果は 104ビットの長さを持つものになります。 1.1111111111111111111111111111111111111111111111111110110110101010001 000111011101010100110010000011010011 x 2-1023 The highlighted bit is significant bit 54; since the value of bit 54 and beyond is greater than 1/2 ULP, this number, to 53 bits, rounds up to 1.1111111111111111111111111111111111111111111111111111 x 2-1023 As I said above, this is a subnormal number in double-precision, so it must now be rounded to 52 bits. The highlighted bit ? bit 53 ? is 1, and all bits beyond are zero. This is a halfway case, so the tie is broken with the round-half-to-even rule. In this case, that means rounding up to 2-1022. If the result were rounded only once, directly to 52 bits, it would have been correct: 1.111111111111111111111111111111111111111111111111111 x 2-1023 In other words, this is a double rounding error. It essentially undoes the subtraction. The second rounding adds back 2-1074, which is the amount we were really trying to subtract in the first place! The Generated Code With ‘Volatile' After restoring the ‘volatile' keyword to the source code, I rebuilt PHP and traced it again. With the ‘volatile' keyword, the two highlighted lines of source code compile to these instructions: 0x8302ba4 call 0x8301290 <ulp>； ... 0x83012b9 fldl -0x10(%ebp) ; load ulp(value(rv)) 0x83012bc leave 0x83012bd ret ... 0x8302ba9 mov -0xac(%ebp),%eax 0x8302baf fldl -0x38(%ebp) ; load aadj1 0x8302bb2 fmulp %st,%st(1) ; adj = aadj1 * ulp(value(rv)) 0x8302bb4 fstpl -0x40(%ebp) ; store adj 0x8302bb7 fldl -0x48(%ebp) ; load value(rv) 0x8302bba fldl -0x40(%ebp) ; load adj 0x8302bbd faddp %st,%st(1) ; value(rv) += adj 0x8302bbf fstpl -0x48(%ebp) ; store value(rv) In this code, the fmulp and faddp instructions are separated by a load/store sequence, forcing adj to double-precision before adding it to value(rv). This sidesteps the double rounding issue. The Result of the fmulp Instruction (adj) This is the FPU stack after the fmulp instruction at 0x8302bb2 executes: st0 -2.8309023271733244206437071197035129e-324 (raw 0xbbcc92aee22acdf2d000) st1 0 (raw 0x00000000000000000000) st2 0 (raw 0x00000000000000000000) st3 0 (raw 0x00000000000000000000) st4 0 (raw 0x00000000000000000000) st5 0 (raw 0x00000000000000000000) st6 2.2250738585072013830902327173324041e-308 (raw 0x3c018000000000000000) st7 -0.57298100991279032889735844946699217 (raw 0xbffe92aee22acdf2d000) The result of the multiplication, adj, is the same as in the unfixed code, except it lives in st0: -1.001001010101110111000100010101011001101111100101101 x 2-1075. The Result of the fstpl Instruction (value(rv)) The fstpl at address 0x8302bb4 stores the copy of adj in st0 back to memory, which converts it to double-precision. Since the value of st0 is a subnormal number in double-precision, it must be rounded to one bit, since that's all that remains of the 52 maximum bits of subnormal precision at exponent 2-1074. To see where the rounding occurs, rewrite st0 as 丸めが行われている場所を確認するために st0 を以下のように書き換えます -0.1001001010101110111000100010101011001101111100101101 x 2-1074 This rounds up to -2-1074, the smallest value double-precision can represent. The Result of the fldl Instruction (adj) This is the FPU stack after the fldl instruction at 0x8302bba executes: st0 -4.9406564584124654417656879286822137e-324 (raw 0xbbcd8000000000000000) st1 2.2250738585072013830902327173324041e-308 (raw 0x3c018000000000000000) st2 0 (raw 0x00000000000000000000) st3 0 (raw 0x00000000000000000000) st4 0 (raw 0x00000000000000000000) st5 0 (raw 0x00000000000000000000) st6 0 (raw 0x00000000000000000000) st7 2.2250738585072013830902327173324041e-308 (raw 0x3c018000000000000000) This confirms that adj, which is now in st0 as 0xbbcd8000000000000000, contains -2-1074: 1 011101111001101 1000000000000000000000000000000000000000000000000000000000000000 The Result of the faddp Instruction (value(rv)) This is the FPU stack after the faddp instruction at 0x8302bbd executes: 0x8302bbd で faddp 命令を実行した後のFPUスタックはこうなります: st0 2.2250738585072008890245868760858599e-308 (raw 0x3c00fffffffffffff000) st1 0 (raw 0x00000000000000000000) st2 0 (raw 0x00000000000000000000) st3 0 (raw 0x00000000000000000000) st4 0 (raw 0x00000000000000000000) st5 0 (raw 0x00000000000000000000) st6 2.2250738585072013830902327173324041e-308 (raw 0x3c018000000000000000) st7 -4.9406564584124654417656879286822137e-324 (raw 0xbbcd8000000000000000) The addition works this time, changing value(rv), making it 0x3c00fffffffffffff000: 0 011110000000000 1111111111111111111111111111111111111111111111111111000000000000 This is 1.111111111111111111111111111111111111111111111111111 x 2-1023, the correctly rounded result. (It is 52 bits, so when it's stored back to memory with the subsequent fstpl instruction, no rounding occurs.) So the loop now terminates, after only one adjustment to value(rv). Summary まとめ PHP got stuck converting a decimal number whose double-precision binary floating-point representation is DBL_MIN ― 2-1074, a subnormal number at the normalized/unnormalized number boundary. Its conversion routine started with an approximation of DBL_MIN and tried to adjust it by subtracting a number a tad smaller than 2-1074 ― a number too small for a double to represent. The addition of the ‘volatile' keyword forced the adjustment to be rounded to 2-1074, which allowed the subtraction to work as intended ― and avoid a double rounding on underflow error. PHP は、正規化数と非正規化数の境界にある非正規化数であるDBL_MIN - 2^-1074 を表現するような 倍精度の二進浮動小数点数の十進数値への変換で stuck してしまいます。 PHPの変換ルーチンはBDL_MINの近似値から始めて、2^-1074 より小さな tad を引くことで 調整を試みますが、その数値は倍精度で表現するには小さすぎるものなのです。 'volatile' キーワードを追加することで adjustment を強制的に 2^-1074 にします。 この値は意図したように減算が働くようにするもので、アンダーフローにおける 二重丸めを排除するものです。 A Good Quote I found this quote regarding double rounding on underflow, in an appendix of “What Every Computer Scientist Should Know About Floating-Point Arithmetic” titled “Differences Among IEEE 754 Implementations” (see subsection “Programming Language Support for Extended Precision”, bullet 4): “Of course, this form of double-rounding is highly unlikely to affect any practical program adversely.” :) Copyright c 2008-2011 Exploring Binary  #### ■_ 短絡評価 これを見れば、最初から今のような動作であろうことは予想できますね。 Chistory Neonatal C Rapid changes continued after the language had been named, for example the introduction of the && and || operators. In BCPL and B, the evaluation of expressions depends on context: within if and other conditional statements that compare an expression's value with zero, these languages place a special interpretation on the and (&) and or (|) operators. In ordinary contexts, they operate bitwise, but in the B statement if (e1 & e2) ... the compiler must evaluate e1 and if it is non-zero, evaluate e2, and if it too is non-zero, elaborate the statement dependent on the if. The requirement descends recursively on & and | operators within e1 and e2. The short-circuit semantics of the Boolean operators in such truth-value' context seemed desirable, but the overloading of the operators was difficult to explain and use. At the suggestion of Alan Snyder, I introduced the && and || operators to make the mechanism more explicit.  #### ■_ データ構造 あまり知られていないけどイカしたデータ構造は? language agnostic - What are the lesser known but cool data structures ? - Stack Overflow There a some data structures around that are really cool but are unknown to most programmers. Which are they? Everybody knows linked lists, binary trees, and hashes, but what about Skip lists, Bloom filters for example. I would like to know more data structures that are not so common, but are worth knowing because they rely on great ideas and enrich a programmer's tool box. リンクつきリストや、二分木、ハッシュといったものは誰でも知っています。一方で、 たとえばスキップリストやブルームフィルターといったものはそうではありません。 わたしはそういった、一般的ではないけれども知っておくべきデータ構造を もっと知りたいのです。それは、そのようなデータ構造が great ideas に基づいたものであり プログラマーの道具箱を enrich するものだからです。 PS: I am also interested on techniques like Dancing links which make interesting use of the properties of a common data structure. EDIT: Please try to include links to pages describing the data structures in more detail. Also, try to add a couple of words on why a data structures is cool (as Jonas Kölker already pointed out). Also, try to provide one data-structure per answer. This will allow the better data structures to float to the top based on their votes alone.  結構盛り上がってますが回答編は略。 #### ■_ #### ■_ ### 2011年03月23日 #### ■_ ・ちょっと大きめに表紙絵 買った。まだ読んでない。 この表紙絵の場面がどういう局面なのかわからないけど、惹かれるものがあるなあ。 一巻では感じなかったけど。なんで? #### ■_ not 書評 俺のコードのどこが悪い?―コードレビューを攻略する40のルール 藤原 克則 まとめてから 技術系執筆情報 に気づいただよ○|￣|＿ まず、着眼点はいいと思うんですよ。 この本の帯に「40のルール」ってのが書かれているんですが、それは 機能充足性 01 if/else 文での判定条件は適切ですか? 02 データ型は適切ですか? 03 ループの終了判定は妥当ですか? 04 switch文のcaseラベル列挙は十分ですか? 05 引数チェックは十分ですか? 06 戻り値チェックは十分ですか? リソース消費 07 引数チェックが多過ぎませんか? 08 戻り値チェックが多過ぎませんか? 09 関数の局所変数が多過ぎませんか? 10 各行の負荷を把握していますか? 11 似たような処理があちこちにありませんか? 12 不要な条件判定をしていませんか? 13 構造体渡しは必要ですか? 14 不要な関数呼び出しをしていませんか? 15 関数呼び出しのコストを把握していますか? 16 例外に頼り過ぎていませんか? 17 スレッド間同期の方法は適切ですか? 18 アルゴリズムやデータの選択は適切ですか? 19 コンパイラの最適化に期待し過ぎていませんか? 20 ファイルアクセスは妥当ですか? 実行安全性 21 入力データは検証済みですか? 22 スタック消費量は妥当ですか? 23 ヒープ消費量は妥当ですか? 24 スレッド数は十分ですか? 25 ファイル操作は妥当ですか? 26 ネットワークアクセスは妥当ですか? 27 異常が検出された際の挙動は適切ですか? 保守性 28 横に長いコードになっていませんか? 29 空行を適切に挿入していますか? 30 プロジェクトのコーディング規約に従っていますか? 31 自分のコーディングスタイルは一貫していますか? 32 機能を詰め込み過ぎていませんか? 33 変更をどこまで想定していますか? 34 名前付けは適切ですか? 拡張性 35 無駄な拡張性にこだわっていませんか? 36 十分な情報を受け渡していますか? テスト容易性 37 環境に依存する個所は把握していますあ? 38 テスト環境の高地は簡単ですか? 39 日付と時刻の扱いを考えていますか? 40 どこまで細分化したテストを行えますか?  本書ではこれをもっと掘り下げて書いていっています。が、 ところどころ挟まっている「コラム」で致命的ともいえる間違い(勘違い)が。 訂正のページもあるみたいですけど(さっきのリンク)、 それでいいのかなあという気がしないでも (回収しろとはいわんけど)。 具体的にはこう。 まず42ページ。 条件判定の評価順序 C/C++の言語仕様では、次のような条件判定が記述された場合、変数 i/j/k それぞれに対する 判定の順序を、「コンパイラの実装依存」としています。 01.5 if((i == 1) || (j == 2) || (k == 3)) {...} 「コンパイラの実装依存」というのは、「コンパイラによって処理順序が異なるコードが生成される」 可能性があるということを意味します。 「処理順序が異なるコード」の例として、次のようなバリエーションが考えられます。 ・i⇒j⇒kで判定(記述順) ・k⇒j⇒iで判定(逆順) ・生成コードの状態に応じて順序を変更 次のような条件判定の場合を考えて見ましょう(ママ)。 01.6 条件判定が記述順で実施される分には、このプログラムは期待通りに機能します。 しかし、記述と逆順に判定される場合には、変数p経由でのf1参照が先に処理されますので、 変数pがNULLだった際には、このプログラムは異常終了してしまいます。 これらのことから、厳密に言うなら、依存関係を持つ条件判定を、1つの式として記述しては いけません。 もっとも、このような評価順序に依存した既存のプログラムが多数あることから、一般的な 用途で使用されるコンパイラは、記述順序通りに処理するコードを生成します。 以前であれば、「組み込み機器向けには様々なコンパイラ実装がある」という理由が説得力を 持っていましたが、今時は組み込み機器向けの開発であっても、GCCベースのコンパイラを使用する ケースが増えていますので、この理由も徐々に説得力を失っています。 しかし、評価順序に依存した条件判定は、言語仕様上では「たまたま動作するに過ぎない」もの であることは、意識しておく必要があります。  先のリンクで訂正がなされてますが、いくらなんでもこれはといいたくなります。 規格から引くと 5.1.23 プログラムの実行 この規格における意味規則の規定では、最適化の問題を無視して抽象計算機の動作として記述する。 ボラタイルオブジェクトへのアクセス、オブジェクトの変更、ファイルの変更、又はこれらのいずれか の操作を行う関数の呼び出しは、すべて副作用(side effect) と呼び、実行環境の状態に変化を生じる。 式の評価は、副作用を起こしてもよい。副作用完了点(sequence point)と呼ばれる実行環境における 特定の点において、それ以前の評価にともなう副作用は、すべて完了していなければならず、 それ以降の評価に伴う副作用が既に発生していてはならない。(副作用完了点の要約を附属書Cに示す)  6.5.13 論理AND演算子 意味規則 &&演算子の結果の型は、両オペランドの値が0と比較してともに等しくない場合は1、 それ以外の場合は0とする。結果の型は、intとする。 ビット単位の2項&と異なり、&&演算子は左から右への評価を保証する。第1オペランドの評価の直後を 副作用完了点とする。第1オペランドの値が0と比較して等しい場合、第2オペランドは評価しない。  6.5.14 論理OR演算子 意味規則 || 演算子の結果の型は、両オペランドの値が0と比較していずれか一方でも等しくない場合は1、 それ以外の場合は0とする。結果の型は、intとする。 ビット単位の2項|と異なり、||演算子は左から右への評価を保証する。第1オペランドの評価の直後を 副作用完了点とする。第1オペランドの値が0と比較して等しくない場合、第2オペランドは評価しない。  このあたりですか。 で、情報ページには 技術系執筆情報 1990 年代初頭に C 言語を習い憶えた当時の 「演算の評価順序は未定義」という記憶から、 論 理演算における評価順序もてっきりコンパイラ依存だと思い込んでいたのですが、 論理演算(+ 幾つかのケース )における評価順序は、 言語仕様において保証されています。 C99 以前の仕様に関しては、 正式な文書を確認できなかったのですが、 歴史的経緯に関する注 記等が特に無いことから、 ANSI 等で標準化された時点からの仕様のように思われます。  自分が最初に学んだときは～という空気を感じないでもないですがそれはさておき、 ANSI どころかもっと前からそうであるという状況証拠があります。 カーニハン大先生の「なぜ私はPascalが嫌いか」から。  3.制御の流れ Pascal では制御の流れの欠陥は小さいが、数が多いのです。急所を一突きされて死ぬというよりも、 何千もの切り傷のために死ぬという感じです。 論理演算子 and と or で評価の順序がはっきりと決まっていません。C ならば&&でも||でも 明確に決まっています。この欠点は、他の多くの言語においてもそうです。ループの制御で最も しばしば問題になります。たとえば、 while(i<=XMAX) and (x[i]>0)do ... というのは、Pascal の非常に悪い使い方になります。なぜなら、iのテストがx[i]のテストより 先に行われるという保証がないからです。  コンパイラーの実装依存であるなんてことならこうは書けないですよね。 #### ■_ もう一個。こっちもすでにほかの人が指摘していますが p59 column enumを持たない言語 Javaの場合、言語仕様としてenumに相当する定数定義機能を持ちません。 また、caseラベル値に変数を使用することができません。 しかし、変数定義をfinal化することで、その変数は固定値とみなされますので、case ラベル値として使用することができるようになります。 04.4 public static final int SLEEPING = 0; public static int INTPROGRESS = 1; .... switch(status){ /* final 化されている変数なので使用可能 */ case SLEEPING: .... 0 向け処理 ....; break; /* final 化されていない変数なのでコンパイルエラー */ case INTPROGRESS: .... 1向け処理 ....; break;  えーとまあそのなんだ。 #### ■_ #### ■_ あとでよむ これとか。 The pursuit of excellence in programming As I write a series of thoughts on the pursuit of excellence in programming, I must preface my essay by asking you to ignore that I wrote these words. I invite you to evaluate the opinions and ideas presented here not ad hominem, but rather on the basis of their own merits. It would be easy to otherwise mistakenly dismiss them with the infamous question posed by Steve Jobs to a blogger: “What have you done that's so great?”. 以下略  #### ■_ これとか。しかし Zed、なんつードメイン名を。 Programming, Motherfucker We are a community of motherfucking programmers who have been humiliated by software development methodologies for years. We are tired of XP, Scrum, Kanban, Waterfall, Software Craftsmanship (aka XP-Lite) and anything else getting in the way of...Programming, Motherfucker. We are tired of being told we're autistic idiots who need to be manipulated to work in a Forced Pair Programming chain gang without any time to be creative because none of the 10 managers on the project can do...Programming, Motherfucker. We must destroy these methodologies that get in the way of...Programming, Motherfucker.  #### ■_ ### 2011年03月22日 #### ■_ #### ■_ アドバイス求む java - coding best practice advise needed - Stack Overflow I want to understand if the following coding practice is good or bad practice. 以下に示す coding practice が果たして good practice なのか bad practice なのか知りたいのです。 I pass a Value Object to a method as a parameter and the called method is returning the same parameter Value Object. I particulary think since it the same object be referenced we dont need to put it as return type. わたしは Value オブジェクトをパラメーターとしてメソッドにわたし、呼び出された メソッドはパラメーターと同じ Value オブジェクトを返しています。 戻り値の型とする必要のない同一のオブジェクトであるので、これでいいのだろうかと 考えています。 Class A { initStudent() { Student studentObj = new Student(); //do some processing studentObj = processStudent(studentObj); } processStudent(Student pObj) { //do something return pObj; } } java coding-style coding-standards programming-practices coding-practices  it is sometimes needed to get a parameter and return a parameter of the same type, even if the manipulation is on the parameter for some reasons: 1. if the object is immutable (i.e. String) 2. if the object actualy can be changed (a good example for it is <T>[] List.toArray(<T>[]) where if there is not enough space in the parameter array, a new one is created, otherwise, the array is written on the parameter) 3. it also implicitly tells the user the value might be changed... 4. don't be afraid to do it if needed, many projects use it (i.e. Apache Lucene)   Indeed you don't need to return the object. Just return nothing. Class A { void initStudent() { Student studentObj = new Student(); //do some processing processStudent(studentObj); } private void processStudent(Student pObj) { //do something } } Edit: Also, as adarshr pointed out, the Student object that you created will be of no use if it does not "go out" of the initStudent() method. Probably you may want to return it, or store it in a Collection instance field.   Yes you shouldn't be returning it. Use access modifiers and return types properly. public class A { public Student initStudent() { Student studentObj = new Student(); //do some processing processStudent(studentObj); return studentObj; } private processStudent(Student pObj) { //do something } }  とまあいろいろ。 #### ■_ ちと時間のたった話題ですが。 PHP Hangs On Numeric Value 2.2250738585072011e-308 - Exploring Binary PHP Hangs On Numeric Value 2.2250738585072011e-308 By Rick Regan (Published January 3rd, 2011) Copyright c 2008-2011 Exploring Binary http://www.exploringbinary.com/php-hangs-on-numeric-value-2-2250738585072011e-308/ I stumbled upon a very strange bug in PHP; this statement sends it into an infinite loop: わたしはPHPのある非常に奇妙なバグに遭遇しました。 次の文が無限ループに落ち込むのです: <?phpd = 2.2250738585072011e-308; ?>

(The same thing happens if you write the number without scientific notation ― 324 decimal places.)

I hit this bug in the two places I tested for it: on Windows (PHP 5.3.1 under XAMPP 1.7.3),
and on Linux (PHP Version 5.3.2-1ubuntu4.5) ― both on an Intel Core Duo processor. I've
written a bug report.

わたしは自分がテストした環境の二つWindows と Linux でこのバグに当たりました。

What's Going On?

As a string, 2.2250738585072011e-308 causes no problems; it's when it's treated as a numeric
value that the bug hits.

This works:

<?php $d = '2.2250738585072011e-308'; echo$d; ?>

but this doesn't:

<?php $d = '2.2250738585072011e-308'; echo$d + 0; ?>

I don't have a debug build of PHP available, but I did poke around in the source code
(version 5.2.8). I found zend_strtod.c, which appears to be based on a very old version of
David Gay's strtod() function. I don't know if zend_strtod() is even called in this case,
but it was simple enough for me to isolate it and run it outside of PHP ― under Visual C++
in fact. It did not hang on that input. (Update: zend_strtod() is called.) What's Special

2.2250738585072011e-308 represents the largest subnormal double-precision floating-point number;
written as a hexadecimal floating-point constant, it's 0x0.fffffffffffffp-1022.
2.2250738585072011e-308 is one of five 17-digit decimal values that convert (correctly) to 0x0.fffffffffffffp-1022:

2.2250738585072011e-308 は非正規化数の倍精度浮動小数点数で最大の値を表現しています。

2.2250738585072011e-308 は 0x0.fffffffffffffp-1022 へと(正しく)変換される
17桁の十進数値五個のうちの一つです。

* 2.2250738585072007e-308
* 2.2250738585072008e-308
* 2.2250738585072009e-308
* 2.2250738585072010e-308
* 2.2250738585072011e-308

Only 2.2250738585072011e-308 causes the problem. It happens to be the largest of the five
decimal values, so I guess that matters somehow.

If you have any thoughts on what the bug is, please let me (or PHP) know.

Is This Serious?

I've gotten no response to my bug report yet, but I can't help but wonder: is this serious?
Could you bring down a Web server by entering this constant in a PHP-based form?

Update

Yes, it's serious.

for example, see

* Reddit.
* Slashdot.
* Hacker News.
* My PHP bug report.

And as pointed out in the comments, equivalent forms of the number also cause the problem:

* Variations on the placement of the decimal point: 0.22250738585072011e-307,
22.250738585072011e-309, 22250738585072011e-324, etc.
* Trailing zeros: e.g., 2.22507385850720110e-308.
* Leading zeros in the exponent: e.g., 2.2250738585072011e-0308.
* (Combinations of the above)

You can also alter the full 324 decimal place version of the number with leading and
trailing zeros.

1/22/2011:
I thought of another set of numbers that cause the same problem ― numbers of 18 digits
or more but with the same 17 digit prefix. For example:

* 2.22507385850720111e-308
* 2.225073858507201123e-308
* 2.22507385850720113099e-308
* 2.225073858507201100001e-308
* 2.2250738585072011123456789012345678901234567890123456789e-308
* etc.

In decimal these numbers are distinct, but they all map to the same floating-point number.

It seems the only ones that cause a problem though are those whose 18th digit is 0, 1,
2, or 3 (2.22507385850720114e-308 doesn't cause the problem, for example).

Update: The Bug Has Been Fixed

The bug fix was released on 1/6/2011; this is from the PHP news release:

This release resolves a critical issue … where conversions from string to double might
cause the PHP interpreter to hang on systems using x87 FPU registers.

The problem is known to only affect x86 32-bit PHP processes, regardless of whether the
system hosting  PHP is 32-bit or 64-bit.

Update: An Analysis of the Bug and Its Fix

I've found the specific cause of the bug, and why the fix fixes it; please see my article
“Why “Volatile” Fixes the 2.2250738585072011e-308 Bug.”


Why “Volatile” Fixes the 2.2250738585072011e-308 Bug - Exploring Binary


### 2011年03月21日

#### ■_

Twitter / @Faith and Brave: Masterminds of Programming ...

Masterminds of Programmingを読むと、Objective-C設計者の老害っぷりがわかる。


がちょっと気になったり。 あの本の Objective-C の章でインタビュー受けているのは二人いるのだけど、 ふつー(ってなんですか) Objective-C の作者(設計者)として挙げられるのは おそらく Brad Cox の方。 Brad Cox, Ph.D.

んで、ざっと調べても

The A to Z of programming languages: Objective-C - a-z of programming languages, A to Z of programing languages - Computerworld

Do you have any regrets about maintaining the language i.e., any problems you didn't
resolve in the language you wish you had?

No. Lack of garbage collection was a known issue from the beginning, but an
inescapable one without sacrificing the reasons people choose C in the first place.

Was there any apprehension to selling the rights to Objective-C to Steve Jobs' NeXT in 1995?

Not really.

Do you think Objective-C would still be around today if it weren't for NeXT's acquisition?

Probably not.

How do you feel Objective-C 2.0 added to your original language? Can you still call it
yours in any sense?

I've never thought of as “mine” in any sense, certainly not emotionally. Its a
soldering gun. And lots of other people were involved in building it, not just me.

Have you played any significant part Objective-C's evolution since the acquisition?

Not really. I've moved on to larger granularity components, particularly SOA and OSGI.

Objective-C - Wikipedia

ためにStepstone社を創立した。Stepstone社は、Objective-Cに力を注いだが、それはマイナー
な存在であった。Objective-Cが認知され始めるきっかけは、1985年、アップルコンピュータを

ステムの開発を行うNeXT Computer社を創立したことに始まる。

そのマシンのユーザインタフェースは、Display PostScriptと Objective-Cで書かれた
Application Kitにより提供され、Objective-CはNeXTコンピュータの主力言語となった。その後
の歴史は、主にNeXT社とともにあり、GCCをベースにしたObjective-Cサポートが行われ、プロト
コルの導入など文法の拡張なども行われている。NeXT社による多くの成果は、GCCに還元されて
いる。

1995年には、NeXT社がStepstone社からObjective-C言語と、その商標に関する全ての権利を買い

v10.5からは一部言語仕様の変更が行われObjective-C 2.0と呼ばれる（詳細は#Objective-C 2.0
を参照）。

プル社の携帯電話iPhoneにおいてCocoa Touchフレームワークの記述言語として採用。iPhone OS
の普及にともない、習得者の人口が増える傾向にある。

Objective-C - Wikipedia, the free encyclopedia
History

Objective-C was created primarily by Brad Cox and Tom Love in the early 1980s at their
company Stepstone. (略)

Love and Cox eventually formed a new venture, Productivity Products International (PPI),
to commercialize their product, which coupled an Objective-C compiler with class
libraries. In 1986, Cox published the main description of Objective-C in its original
form in the book Object-Oriented Programming, An Evolutionary Approach. Although he
was careful to point out that there is more to the problem of reusability than just
the language, Objective-C often found itself compared feature for feature with other
languages.

Popularization through NeXT

After Steve Jobs left Apple Inc., he started the company NeXT. In 1988, NeXT licensed
Objective-C from StepStone (the owner of the Objective-C trademark) and released its
own Objective-C compiler and libraries on which the NeXTstep user interface and
interface builder were based. While the NeXT workstations failed to make a great
impact in the marketplace, the tools were widely lauded in the industry. This led NeXT
to drop hardware production and focus on software tools, selling NeXTstep (and
OpenStep) as a platform for custom programming.

The GNU project started work on its free clone of NeXTStep, named GNUstep, based on
the OpenStep standard. Dennis Glatting wrote the first GNU Objective-C runtime in 1992.
The GNU Objective-C runtime, which has been in use since 1993, is the one developed by
Kresten Krab Thorup when he was a university student in Denmark. Thorup also worked at
NeXT from 1993 to 1996.

After acquiring NeXT in 1996, Apple Computer used OpenStep in its new operating system,
Mac OS X. This included Objective-C and NeXT's Objective-C based developer tool,
Project Builder (later replaced by Xcode), as well as its interface design tool,
Interface Builder. Most of Apple's present-day Cocoa API is based on OpenStep
interface objects, and is the most significant Objective-C environment being used for
active development.



Apple 主導になってからどのように変わったのかはわからないけど、 現状の Objective-C についての不満の原因をすべて Cox (と Love) におっ被せるのは どうしたもんかなあと。 例の本の、Love のインタビューについては言語以外のところで この辺かなあと思うところがなくはなかったけども、 なんでもかんでも「老害」(って便利な言葉だよね)と片付けるのは良くないんじゃないかなあ。

こういうのもあった。

Rubyist Magazine - Ruby Conference 2004 レポート

Brad Cox による今年の RubyConf のキーノートスピーチです。 前半と後半に分けて、前半が
Objective-C とそこに至るまでの話、 後半がそれ以降、現在までの彼の関心事を話す、と前置
きしました。

まず Objective-C は彼にとってはゴールではなかった、という点を 強調していました。彼の目

いうことです。 (cooperation tool とも言ってました。グループウェアみたいなものでしょう
か?) そして、Objective-C の言語仕様や特徴をごく簡単に説明し、なんと 開始から 20 分ほど
で前半部分が終了してしまいました。 前ふりの時間も含めれば言語の話は 15 分もなかったの
ではないでしょうか。

(略)

RubyConf という世界でも有数の言語オタクが集合する機会であるにも 関わらず、反応を見てい
る限りは Objective-C の話がほとんど無くても 皆さんたのしんでおられたようです。Rubyist
の性格がわかったような。 私には、Brad Cox にとっては Objective-C は完全に過去のものに
なってしまっているんだなあ、という印象が強く残りました。


Java のライブラリについても弁護したいけどそれはまた (気力がわかない)。

まあ、「あの本読んでいてわからんようならおまえ自身も老害の一人だ」 と言われればそのとおりなんですがね ○|￣|＿

#### ■_ Don't Distract New Programmers with OOP

オブジェクト指向プログラミングで新米プログラマーの取り乱させるな。 くらい?

prog21: Don't Distract New Programmers with OOP
Don't Distract New Programmers with OOP

The shift from procedural to OO brings with it a shift from thinking about problems
and solutions to thinking about architecture. That's easy to see just by comparing a
procedural Python program with an object-oriented one. The latter is almost always
longer, full of extra interface and indentation and annotations. The temptation is to
start moving trivial bits of code into classes and adding all these little methods and
anticipating methods that aren't needed yet but might be someday.

When you're trying to help someone learn how to go from a problem statement to working
code, the last thing you want is to get them sidetracked by faux-engineering busywork.
Some people are going to run with those scraps of OO knowledge and build crazy class
hierarchies and end up not as focused on on what they should be learning. Other people
are going to lose interest because there's a layer of extra nonsense that makes
programming even more cumbersome.

At some point, yes, you'll need to discuss how to create objects in Python, but resist
for as long as you can.

(If you liked this, you might like How I Learned to Stop Worrying and Love Erlang's Process Dictionary.)


#### ■_ もしもわたしがJでプログラミングしたら

Obviously he is a stoner 何だこの言い回し。

Mind if I do a J? : programming

Obviously, you're not a golfer.



excuse me? indeed, i'm not a golfer.



Code golf.



ah, thx; indeed, i'm no golfer, though i like economy of input in my shell (and that's
partially why i looked at J) there is always M-/ in Emacs both for sanity's sake of
reading the code and not needing to be touch typer type.




"Mind if I do a J?" and "Obviously, you're not a golfer." are
quotes from the movie The Big Lebowski.



Obviously he is a stoner



You say that like it's a bad thing.



No objection here, just make sure you pass it to the left.



Oh my god. All the operators in J pass their result on the left hand side. That's brilliant.


Are you sure this is the biggest problem? What about this one? 6%3 2 or even... _5+5 0 "Everyone uses the slash for division, percent for the modulo, and minus sign for negation. That's too mainstream for us." Fucking hipster language.


My memory of J is a bit rusty, but I believe there are reasons for those points.

"/" gets appropriated to the fold adverb. This use comes directly from APL.
So it's not as if it's totally without reason, though that doesn't necessarily mean
it's a good reason.

"%" gets promoted to division duty, and "|" takes over his old job.
Which, while alien to programmers, it makes a teensy bit more sense for those familiar
with number theory and whatnot, where Mr. Pipe is used to indicate divisibility. So,
it's fairly arbitrary given that divisibility is separate from modulo, but they're
sorta close so I guess it's a bit less arbitrary than using Mr. Percent.

Lastly, the underscore is part of the number syntax, rather than a verb like the
hyphen. For example:

_1 2 3
_1 2 3
- 1 2 3
_1 _2 _3

Given that J is an array language, this behavior makes sense to me. As you pointed out,
the minus sign is generally used for negation. This seems like a sensible way to
extend that notion to arrays.

That being said, it's fine to disagree. J does stray quite a bit from the norm. And
it's a little fucking crazy at times.


I highly recommend Learning J to anyone interested in the language.

Also, if you want some interesting visual stuff, type "require 'viewmat'"
into the prompt, then apply that verb to matrixes. Damn.

    6%2 NB. That's division by the way! WHY?

Well, Ken's previous language, APL, wrote that as 6÷2 instead. But people complained
that they didn't have a ÷ sign on their keyboard, or in their font, or in fact in
their character encoding at all. So when he designed J, he used ASCII, and % is the
closest thing to ÷.

NB. comes from Latin by way of JOSS.


Because it uses / for reduce...

e.g.:

+/ 1 2 3 4 5
15

Now, / is the adverb, probably to keep the dictionary more consistent or something.


"APL is a mistake, carried through to perfection. It is the language of the
future for the programming techniques of the past: it creates a new generation of
coding bums." -Djikstra.

That said, if this is the dumbest language you've seen, you need to get out more


What's the smartest language you've ever seen? Java?



COBOL.



Java is pretty good. They made a few (self-admitted) design mistakes, like their
implementation of generics and auto-boxing, but it is generally a very good language.

Python is quite good as well, but in very different ways, and you need a paradigm
switch if you want to program effectively in Python.

C++ is quite good, as long as you impose strict guidelines on which features are not
to be used (like exceptions) and which ones are to be used in very limited
circumstances.



My post was tongue-in-cheek. Java is not a bad language, but my point was, just
because you don't understand what a language is useful for doesn't mean it's
"dumb". I'm pretty certain J is very useful to some people.

Also, learning stuff that you don't understand in the first place is a good way to
learn different ways of doing things and to keep your creativity sharp.



You don't know mathematical speed and power until you've worked with J or APL.

`

ぼつ。

リンクはご自由にどうぞ

メールの宛先はこちら