ときどきの雑記帖' (Occasional Notes'), late October 2013

October 31, 2013

■_

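October is over. It's already the season when it's still dark at the hour I get up. Then again, the winter solstice is less than two months away.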

■_

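This story keeps spreading steadily. Toyota's Killer Firmware - Slashdot; 米国でのトヨタ車急加速事件、ファームウェアに欠陥があったとの見方ふたたび | スラッシュドット・ジャパンハードウェア (Slashdot Japan, Hardware: the view that a firmware defect was behind the Toyota unintended-acceleration incidents in the US surfaces again)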

■_ (?[ ])

So this is something that became available in 5.18. perlrecharclass - perldoc.perl.org

perlrecharclass - perldoc.perl.org

Extended Bracketed Character Classes

This is a fancy bracketed character class that can be used for more readable and less error-prone classes, and
to perform set operations, such as intersection. An example is

    /(?[ \p{Thai} & \p{Digit} ])/

This will match all the digit characters that are in the Thai script.

This is an experimental feature available starting in 5.18, and is subject to change as we gain field experience
with it. Any attempt to use it will raise a warning, unless disabled via

    no warnings "experimental::regex_sets";

Comments on this feature are welcome; send email to perl5-porters AT perl.org .

We can extend the example above:

    /(?[ ( \p{Thai} + \p{Lao} ) & \p{Digit} ])/

This matches digits that are in either the Thai or Laotian scripts.

Notice the white space in these examples. This construct always has the /x modifier turned on.

The available binary operators are:

    & intersection
    + union

    | another name for '+', hence means union

    - subtraction (the result matches the set consisting of those
    code points matched by the first operand, excluding any that
    are also matched by the second operand)

    ^ symmetric difference (the union minus the intersection). This
    is like an exclusive or, in that the result is the set of code
    points that are matched by either, but not both, of the
    operands.

There is one unary operator:

    ! complement

All the binary operators left associate, and are of equal precedence. The unary operator right associates,
and has higher precedence. Use parentheses to override the default associations. Some feedback we've received
indicates a desire for intersection to have higher precedence than union. This is something that feedback from
the field may cause us to change in future releases; you may want to parenthesize copiously to avoid such
changes affecting your code, until this feature is no longer considered experimental.

(rest omitted)

Huh, so besides '&' there are also '+', '-', and '^'. I can't think of a use for '^' myself (though I'm sure there is some impressive one).
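
As a rough illustration of the operators quoted above, here is a sketch using Python sets of characters. The name-prefix tests for Thai and Lao are my own approximation of \p{Thai} and \p{Lao} (Python's standard library has no script property), and \p{Digit} is approximated by general category Nd; none of this is how Perl implements the feature.

```python
import unicodedata

# Build three character sets in a single pass over all code points.
thai, lao, digit = set(), set(), set()
for cp in range(0x110000):
    ch = chr(cp)
    name = unicodedata.name(ch, "")
    if name.startswith("THAI "):      # assumption: stands in for \p{Thai}
        thai.add(ch)
    if name.startswith("LAO "):       # assumption: stands in for \p{Lao}
        lao.add(ch)
    if unicodedata.category(ch) == "Nd":
        digit.add(ch)                 # stands in for \p{Digit}

thai_digits = thai & digit                 # (?[ \p{Thai} & \p{Digit} ])
thai_or_lao_digits = (thai | lao) & digit  # (?[ ( \p{Thai} + \p{Lao} ) & \p{Digit} ])
thai_non_digits = thai - digit             # subtraction
exactly_one = thai ^ digit                 # symmetric difference

# Python binds & tighter than |, while the quoted text says the regex-set
# operators share one precedence and associate to the left.
python_reading = thai | lao & digit        # thai | (lao & digit)
perl_reading = (thai | lao) & digit        # what A + B & C would mean in (?[ ])

print(sorted(thai_digits))
print(python_reading == perl_reading)
```

The last comparison is the reason the quoted documentation advises parenthesizing copiously: the same unparenthesized expression groups differently depending on the precedence rules in force.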

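Actually, when I was in junior high, Kawakami-san came to give a lecture at the junior high school he had once attended, for some commemoration or other. I think a (fairly large) piece of his calligraphy was displayed in the gym; I wonder whether it is still there. 【訃報】「打撃の神様」と呼ばれた巨人V9達成時の監督・川上哲治さん死去 - GIGAZINE (obituary: Tetsuharu Kawakami, the "god of batting" and manager of the Giants during their V9 run, has died). I only recently learned that the Giants have never won back-to-back titles since V9, which surprised me a little. プロ野球歴代優勝チーム (list of past pro baseball champions): I see, they have had consecutive league pennants but have not won the Japan Series in consecutive years. Still, the Lions from '82 to '94 were amazing.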

■_ Toyota's killer firmware

Lively on both reddit and HN (though HN isn't responding for some reason): Toyota's killer firmware: Bad design and its consequences : programming, https://news.ycombinator.com/item?id=6636811. And the original article: Toyota's killer firmware: Bad design and its consequences | EDN

Toyota's killer firmware: Bad design and its consequences | EDN

On Thursday October 24, 2013, an Oklahoma court ruled against Toyota in a case of unintended acceleration that
lead to the death of one the occupants. Central to the trial was the Engine Control Module's (ECM) firmware.

Embedded software used to be low-level code we'd bang together using C or assembler. These days, even a
relatively straightforward, albeit critical, task like throttle control is likely to use a sophisticated RTOS
and tens of thousands of lines of code.

(omitted)

The ECM software formed the core of the technical investigation. What follows is a list of the key findings.

Mirroring (where key data is written to redundant variables) was not always done. This gains extra significance
in light of …

Stack overflow. Toyota claimed only 41% of the allocated stack space was being used. Barr's investigation showed
that 94% was closer to the truth. On top of that, stack-killing, MISRA-C rule-violating recursion was found in
the code, and the CPU doesn't incorporate memory protection to guard against stack overflow.

Two key items were not mirrored: The RTOS' critical internal data structures; and—the most important bytes of
all, the final result of all this firmware—the TargetThrottleAngle global variable.

Although Toyota had performed a stack analysis, Barr concluded the automaker had completely botched it. Toyota
missed some of the calls made via pointer, missed stack usage by library and assembly functions (about 350 in
total), and missed RTOS use during task switching. They also failed to perform run-time stack monitoring.

Toyota's ETCS used a version of OSEK, which is an automotive standard RTOS API. For some reason, though, the
CPU vendor-supplied version was not certified compliant.

(omitted)
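
To make "mirroring" concrete, here is a minimal sketch of one common way to do it: store the value together with its bitwise complement and check the pair on every read. The excerpt only says key data is "written to redundant variables", so the complement scheme, the 16-bit width, and the class and variable names are my assumptions; this is not Toyota's code.

```python
class MirroredValue:
    """Keep a 16-bit value together with its bitwise complement as a redundant copy."""

    MASK = 0xFFFF

    def __init__(self, value):
        self.write(value)

    def write(self, value):
        self._primary = value & self.MASK
        self._mirror = ~value & self.MASK   # redundant copy, stored inverted

    def read(self):
        # The two copies must be exact complements; anything else means
        # one of them changed after the last write.
        if self._primary ^ self._mirror != self.MASK:
            raise RuntimeError("mirror mismatch: value corrupted")
        return self._primary


target_throttle_angle = MirroredValue(12)   # hypothetical name and value
print(target_throttle_angle.read())

target_throttle_angle._primary ^= 0x0040    # simulate a single flipped bit
try:
    target_throttle_angle.read()
except RuntimeError as exc:
    print("detected:", exc)
```

An unmirrored variable has no such check: a stray write or bit flip is simply read back as if it were valid, which is why the excerpt singles out the unmirrored throttle variable.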

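Following the link in the original article, Acceleration Case: Jury Finds Toyota Liable | EE Times, this turns out to be about that series of uproars in North America a few years ago. The excerpt above says the control software had various problems, but didn't some organization inspect the source code and conclude there was nothing wrong?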

Acceleration Case: Jury Finds Toyota Liable | EE Times

It's important to note, however, that Toyota's electronics throttle control system had already been the subject
of a NASA investigation that reportedly found no electronic causes of unintended acceleration. After the US
space agency's 10-month investigation, the National Highway Traffic Safety Administration closed its probe of
Toyota models in February 2011.

Ah, here it is.

Acceleration Case: Jury Finds Toyota Liable | EE Times

But not everyone in the embedded systems industry thinks NASA had enough time to come up with a complete report.
Perhaps more significantly, in its report, NASA itself did not rule out the possibility of software having
caused unintended acceleration.

    The NESC team identified two hypothetical ETSC-i failure mode scenarios (as opposed to non-electronics pedal
    problems caused by sticking accelerator pedal, floor mat entrapment, or operator misapplication) that could
    lead to [an unintended acceleration] without generating a diagnostic trouble code (DTC): specific dual
    failures in the pedal position sensing system and a systematic software malfunction in the main central
    processor unit (CPU) that is not detected by the monitor system...

    The second postulated scenario is a systematic software malfunction in the Main CPU that opens the throttle
    without operator action and continues to properly control fuel injection and ignition...

    Because proof that the ETSC-i caused the reported UAs was not found does not mean it could not occur.

Hmm.

■_

Continuing from CVE-2013-0242 in Ubuntu. First, looking at what was changed:

 
 static reg_errcode_t
 internal_function __attribute_warn_unused_result__
-extend_buffers (re_match_context_t *mctx)
+extend_buffers (re_match_context_t *mctx, int min_len)
 {
   reg_errcode_t ret;
   re_string_t *pstr = &mctx->input;
@@ -4111,8 +4111,10 @@ extend_buffers (re_match_context_t *mctx)
   if (BE (INT_MAX / 2 / sizeof (re_dfastate_t *) <= pstr->bufs_len, 0))
     return REG_ESPACE;
 
-  /* Double the lengthes of the buffers.  */
-  ret = re_string_realloc_buffers (pstr, MIN (pstr->len, pstr->bufs_len * 2));
+  /* Double the lengthes of the buffers, but allocate at least MIN_LEN.  */
+  ret = re_string_realloc_buffers (pstr,
+				   MAX (min_len,
+					MIN (pstr->len, pstr->bufs_len * 2)));
   if (BE (ret != REG_NOERROR, 0))
     return ret;
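
The arithmetic in that diff can be tried out in isolation. The sketch below only mirrors the two length expressions; the function names and the sample numbers are hypothetical, and whether a too-small growth step is what actually triggers the overrun is my reading of the diff, not something the patch text states.

```python
def old_new_len(str_len, bufs_len):
    """Length requested before the patch: double, capped at the string length."""
    return min(str_len, bufs_len * 2)


def patched_new_len(str_len, bufs_len, min_len):
    """Length requested after the patch: the same, but never below min_len."""
    return max(min_len, min(str_len, bufs_len * 2))


# Hypothetical numbers: an 8-slot buffer, a 25-byte input, and a caller
# that needs slot index 20 to exist (so min_len is 21).
print(old_new_len(25, 8))          # 16: index 20 would still be out of bounds
print(patched_new_len(25, 8, 21))  # 21
```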

An integer overflow, perhaps? Leaving that aside,

Andreas Schwab - [PATCH] Fix buffer overrun in regexp matcher

+static int
+do_test (void)
+{
+  struct re_pattern_buffer r;
+  /* ကျွန်ုပ်x */
+  const char *s = "\xe1\x80\x80\xe1\x80\xbb\xe1\x80\xbd\xe1\x80\x94\xe1\x80\xba\xe1\x80\xaf\xe1\x80\x95\xe1\x80\xbax";
+
+  if (setlocale (LC_ALL, "en_US.UTF-8") == NULL)
+    {
+      puts ("setlocale failed");
+      return 1;
+    }
+  memset (&r, 0, sizeof (r));
+
+  re_compile_pattern ("[^x]x", 5, &r);
+  /* This was triggering a buffer overflow.  */
+  re_search (&r, s, strlen (s), 0, strlen (s), 0);
+  return 0;
+}
+
+#define TEST_FUNCTION do_test ()
+#include "../test-skeleton.c"

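I don't really understand how this pattern leads to the faulty behavior caused by the bug. The message below talks about incomplete characters, but the string \xe1\x80\x80… doesn't look like that to me (am I mistaken?).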
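
One way to check that impression is to decode the test string outside of glibc. This sketch uses only Python's standard library; it says nothing about how glibc's matcher walks the buffer, it just confirms that the bytes form complete UTF-8 sequences.

```python
import re
import unicodedata

# The byte string from the test case in the patch.
raw = (b"\xe1\x80\x80\xe1\x80\xbb\xe1\x80\xbd\xe1\x80\x94"
       b"\xe1\x80\xba\xe1\x80\xaf\xe1\x80\x95\xe1\x80\xbax")

# A strict decode raises UnicodeDecodeError on a truncated sequence.
text = raw.decode("utf-8")
for ch in text:
    print(f"U+{ord(ch):04X} {unicodedata.name(ch)}")

# The same pattern, matched by Python's own regex engine.
match = re.search(r"[^x]x", text)
print(len(raw), len(text), match.span())
```

The decode succeeds, giving eight three-byte Myanmar characters followed by an ASCII x, so the input itself contains no incomplete character.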

Carlos O'Donell - Re: [PATCH] Fix buffer overrun in regexp matcher

This comment hasn't been true since MIN() was added by:
~~~
commit 8887a920a4b81a500f54893250085e0d1a52cf9a
Author: Ulrich Drepper 
Date:   Sat May 28 17:14:30 2011 -0400

    Fix unnecessary overallocation due to incomplete character
    
    When incomplete characters are found at the end of a string the
    code ran amok and allocated lots of memory.  Stricter limits
    are now in place.
~~~

I don't have a good grasp of the internals of the current glibc regex. I'll dig into it a bit more.

■_

October 29, 2013

■_

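Hisaichi Ishii has a work called 「バイトくん」 (Baito-kun), and some of its characters run an organization called the 「安下宿共闘会議」 (roughly, the Cheap Boarding House Joint Struggle Council). I'm oddly fond of the panel where they hold a demonstration and chant "We will fight, as far as we can manage!" There is no punchline.

Episode 3 is out: The Haskell Cast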

■_ Part 1

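In 本の虫: ネタバレ注意：歌舞伎座.tech#2で使うスライド資料 (Hon no Mushi: spoiler warning, the slides to be used at 歌舞伎座.tech#2), the part on binary literals says: a new C++14 feature; already implemented as a vendor extension in GCC, Clang, and Digital Mars C++; also offered with the same syntax in Java 7, Python, and D.

Why those three, I wonder. Presumably they are simply the ones the author was familiar with, but there are much older examples.

As I recall, Perl was first http://j.mp/74EthX <@yukihiro_matz: http://j.mp/59OjR7 Java 7. The _ in integers, binary literals with 0b and so on: I suppose those didn't come from Ruby.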
— Dan Kogai (@dankogai) November 25, 2009

Did Perl not have it back in version 4? (I don't remember.) 日本語 perl texinfo - Table of Contents. I'm fairly sure there was a gcc customized for the X68000 that supported 0b; I wonder where that came from.

Python has had it since 2.6? syntax - How do you express binary literals in Python? - Stack Overflow: Starting with Python 2.6 you can express binary literals using the prefix 0b or 0B:
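
For reference, a short sketch of what the 0b prefix looks like in practice. The variable name is arbitrary, and the underscore digit separator is a later addition (Python 3.6), not part of what the Stack Overflow answer describes.

```python
flags = 0b1010                 # binary literal
print(flags)                   # 10
print(bin(flags))              # 0b1010
print(int("1010", 2))          # 10, parsing a binary string at run time
print(0B1111_0000)             # 240, digit separators need Python 3.6 or later
print(format(flags, "08b"))    # 00001010
```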

Not binary but hexadecimal: I was surprised that there are far more notations for hexadecimal literals than I had expected, lol. Hexadecimal - Wikipedia, the free encyclopedia

■_ Part 2

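RubyとPythonの違いからガベージコレクタを理解する - ワザノバ | wazanova.jp (Understanding garbage collectors through the differences between Ruby and Python). Something about this bothered me a little, so I looked up the original article, and it turns out the Japanese version cuts the original down so much that it could fairly be called an abridged translation. Visualizing Garbage Collection in Ruby and Python - Pat Shaughnessy. Also, I hadn't realized the original's author is the author of Ruby Under a Microscope.

Still, does nobody seem to be following it back to the original article? はてなブックマーク - RubyとPythonの違いからガベージコレクタを理解する - ワザノバ | wazanova.jp; Twitter / 検索 - http://wazanova.jp/post/65317231718/ruby-python

Rendering "implement" as 「インプリ」 bothers me a great deal, but let's set that aside.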

RubyとPythonの違いからガベージコレクタを理解する - ワザノバ | wazanova.jp

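A garbage collector does not just perform the act of "collecting garbage"; it "allocates memory for new
objects," "finds unneeded objects," and "reclaims memory from unneeded objects," working the way the human
heart purifies the blood (人間の心臓が血液を浄化するような働き).

That the heart does not purify blood was also pointed out in the Hatena comments, but looking at the corresponding part of the original article: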

Visualizing Garbage Collection in Ruby and Python - Pat Shaughnessy

The Beating Heart of Your Application

GC systems do much more than just “collect garbage.” In fact, they perform three important tasks. They

    allocate memory for new objects,
    identify garbage objects, and
    reclaim memory from garbage objects.

Imagine if your application was a human body: All of the elegant code you write, your business logic, your
algorithms, would be the brain or the intelligence inside the application. Following this analogy, what part
of the body do you think the garbage collector would be?
[ I got lots of fun answers from the RuPy audience: kidneys, white blood cells :) ]

I think the garbage collector is the beating heart of your application. Just as your heart provides blood and
nutrients to the rest of the your body, the garbage collector provides memory and objects for your application
to use. If your heart stopped beating you would die in seconds. If the garbage collector stopped or ran slowly
－ if it had clogged arteries － your application would slow down and eventually die!

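The amount of text suddenly differs quite a bit, lol. Hmm, where is the sentence that corresponds to "the heart purifies the blood"? From the flow it must be "Just as your heart provides blood and nutrients to the rest of the your body", but that says the heart "provides" "blood and nutrients", doesn't it.

While I'm at it, here are the parts that follow as well. Note that I'm ignoring the images in the original article.

1) Ruby's memory

Before code is executed, Ruby creates several thousand objects in advance and places them on a linked free
list. [figure] Then, when Node.new (1) from the code sample above is called, it takes one object from the free
list and hands it over = it becomes an active object used by the code. [figure] (Of course, in reality there are
also objects serving other roles and things are more complicated, but the explanation uses the idea in this
figure to keep things simple.)

Next, when Node.new is called again, it hands over a second object from the free list. [figure] This MRI
mechanism uses the algorithm that John McCarthy created in 1960 in the course of developing Lisp.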

The Free List

When we call Node.new(1) above, what does Ruby do, exactly? How does Ruby go about creating a new object for us?

Surprisingly, it does very little! In fact, long before your code starts to run, Ruby creates thousands of
objects ahead of time and places them on a linked list, called the free list. Here’s what the free list might
look like, conceptually:

Imagine each of the while squares above is an unused, precreated Ruby object. When we call Node.new, Ruby
simply takes one of these objects and hands it to us:

In the diagram above, the gray square on the left represents an active Ruby object we’re using in our code,
while the remaining white squares are unused objects. [ Note: of course, my diagrams are a simplified version
of reality. In fact, Ruby would use another object to hold the string “ABC,” a third object to hold the class
definition of Node, and still other objects to hold the parsed, abstract syntax tree (AST) representation of my
code, etc. ]

This simple algorithm of using a linked list of precreated objects was invented over 50 years ago by a legendary
computer scientist named John McCarthy, while he was working on the original implementation of Lisp. Lisp was
not only one of the first functional programming languages, but also contained a number of other groundbreaking
advances in computer science. One of these was the concept of automatically managing your application’s memory
using garbage collection.

The standard version of Ruby, also known as “Matz’s Ruby Interpreter” (MRI), uses a GC algorithm similar to
the one used by McCarthy’s implementation of Lisp in 1960. For better or worse, Ruby uses a 53 year old
algorithm for garbage collection. Just as Lisp did, Ruby creates objects ahead of time and hands them to your
code when you allocate new objects or values.

Does "long before your code starts to run, Ruby creates thousands of objects ahead of time and places them on a linked list, called the free list." really become "Before code is executed, Ruby creates several thousand objects in advance and places them on a linked free list"? And "The standard version of Ruby, also known as “Matz’s Ruby Interpreter” (MRI), uses a GC algorithm similar to the one used by McCarthy’s implementation of Lisp in 1960." becomes "This MRI mechanism uses the algorithm that John McCarthy created in 1960 in the course of developing Lisp." The original spells out “Matz’s Ruby Interpreter” (MRI), so writing only "MRI" will leave readers, especially those who don't know Ruby, wondering what that is; the original's "similar to" has gone missing; and conversely, where did "in the course of developing (Lisp)" come from?
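
The free list described in the original article can be sketched in a few lines. This is a toy model under my own naming (Slot, FreeListHeap), with a handful of slots instead of thousands; it is not how MRI is actually written.

```python
class Slot:
    """One precreated object slot; `next` links the unused slots together."""

    def __init__(self):
        self.value = None
        self.next = None


class FreeListHeap:
    def __init__(self, size):
        # Create every slot up front and chain them into the free list.
        self.free = None
        for _ in range(size):
            slot = Slot()
            slot.next = self.free
            self.free = slot

    def allocate(self, value):
        """Hand out the slot at the head of the free list."""
        if self.free is None:
            raise MemoryError("free list exhausted; a real runtime would run GC here")
        slot = self.free
        self.free = slot.next
        slot.next = None
        slot.value = value
        return slot

    def free_count(self):
        count, slot = 0, self.free
        while slot is not None:
            count += 1
            slot = slot.next
        return count


heap = FreeListHeap(4)
n1 = heap.allocate("ABC")
n1 = heap.allocate("DEF")   # the ABC slot is abandoned, not returned to the list
print(heap.free_count())    # 2
```

Allocation here is just unlinking the head of a list, which is the sense in which the original says Ruby "does very little" when Node.new is called.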

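2) Python's memory

Python too has a free list mechanism internally, since it reuses objects for lists, but normally it handles
memory differently from Ruby. Python requests memory from the OS as soon as it creates an object. [figure]
(Python actually implements (インプリ) a "memory application system" (メモリ適用システム) that creates an additional
abstraction layer on top of the OS heap, but the details are omitted this time.) When a second object is
created, it likewise requests memory from the OS. [figure]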

What about Python?

While Python also uses free lists for various reasons internally (it recycles certain objects such as lists),
it normally allocates memory for new objects and values differently than Ruby does.

Suppose we create a Node object using Python:

Python, unlike Ruby, will ask the operating system for memory immediately when you create the object. (Python
actually implements its own memory allocation system which provides an additional layer of abstraction on top
of the OS heap. But I don’t have time to get into those details today.)

When we create a second object, Python will again ask the OS for more memory:

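"Python actually implements a memory application system that creates an additional abstraction layer on top of the OS heap, but the details are omitted this time." versus "its own memory allocation system": did they mix up allocation and application? (Re: 「メモリ適用システム」, "memory application system".)

3) Ruby leaves unused (未利用) objects lying around

As objects are created one after another in Ruby, the free list runs low. [figure] Note that when a new value is
assigned (バリューがアサイン) to n1, the object holding the old value is left lying around. [figure]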

Ruby leaves unused objects lying around in memory until the next GC process runs.

Seems simple enough; at the moment we create an object Python takes the time to find and allocate memory for us.

Ruby Developers Live in a Messy House

Back to Ruby. As we allocate more and more objects, Ruby will continue to hand us precreated objects from the
free list. As it does this, the free list will become shorter:

…and shorter:

Notice as I continue to assign new values to n1, Ruby leaves the old values behind. The ABC, JKL and MNO nodes
remain in memory. Ruby doesn’t immediately clean up old objects my code is no longer using! Working as a Ruby
developer is like living in a messy house, with clothes lying on the floor or dirty dishes in the kitchen sink.
As a Ruby developer you have to work with unused, garbage objects surrounding you.

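I'd have thought "unused" should be 「未使用」 rather than 「未利用」. And 「バリューがアサイン」 (katakana for "a value is assigned")... hmm.

4) Python cleans up unused (未利用) objects

In Python, the object's C structure holds an integer called the reference count, whose initial value is 1. The
number 1 means that one pointer points to it, that it is referenced. [figure] When a new node is created and the
original node becomes unneeded, its reference count becomes zero. [figure] Python immediately returns that memory
to the OS. [figure] The reference counter is an algorithm devised by George Collins in the same year, 1960.
Next, when n2 refers to the same node as n1, the node n2 previously referred to is cleaned up. [figure]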

Python Developers Live in a Tidy Household

Python cleans up garbage objects immediately after your code is done using them.

Garbage collection works quite differently in Python than in Ruby. Let’s return to our three Python Node
objects from earlier:

Internally, whenever we create an object Python saves an integer inside the object’s C structure, called the
reference count. Initially, Python sets this value to 1:

The value of 1 indicates there is one pointer or reference to each of the three objects. Now suppose we create
a new node, JKL:

Just as before, Python sets the reference count in JKL to be 1. However, also notice since we changed n1 to
point to JKL, it no longer references ABC, and that Python decremented its reference count down to 0.

At this point, the Python garbage collector immediately jumps into action! Whenever an object’s reference
count reaches zero, Python immediately frees it, returning it’s memory to the operating system:

Above Python reclaims the memory used by the ABC node. Remember, Ruby simply leaves old objects lying around
and doesn’t release their memory.

This garbage collection algorithm is known as reference counting. It was invented by George Collins in 1960 –
not coincidentally the same year John McCarthy invented the free list algorithm. As Mike Bernstein said in his
fantastic presentation on garbage collection at the Gotham Ruby Conference back in June: “1960 was a good year
for Garbage Collectors….”

Working as a Python developer is like living in a tidy house; you know, the kind of place where your roommates
are a bit OCD and are constantly cleaning up after you. As soon as you put down a dirty dish or glass, someone
has already put it away in the dishwasher!

Now for a second example. Suppose we set n2 to refer to the same node as n1:

Above to the left you can see Python has decremented the reference count for DEF and will immediately garbage
collect the DEF node. Also note that the JKL now has a reference count of 2, since both n1 and n2 point to it.

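What a difference in the amount of text. "The reference counter is an algorithm devised by George Collins in the same year, 1960." With the original cut down this much, being told "the same" 1960 just leaves you wondering: the same as what?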
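
The immediate cleanup described in the original can be observed directly on CPython. This sketch relies on CPython's reference counting (other Python implementations may finalize later), and the Node class with its recording `__del__` is my own device for making the moment of freeing visible.

```python
import sys

freed = []   # names of nodes whose finalizer has run


class Node:
    def __init__(self, val):
        self.value = val

    def __del__(self):
        freed.append(self.value)


n1 = Node("ABC")
n2 = Node("DEF")
# getrefcount reports at least one extra reference for its own argument.
print(sys.getrefcount(n1))

n1 = Node("JKL")   # ABC loses its last reference here
print(freed)       # on CPython ABC is already finalized at this point

n2 = n1            # DEF loses its last reference; JKL now has two
print(freed)
```

No collector pass is involved: the finalizers run at the assignment that drops the count to zero, which is the "tidy household" behavior the original describes.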

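5) Ruby's Mark & Sweep algorithm

With Ruby's structure, where garbage keeps piling up, the free list eventually runs dry. When that happens, Ruby
stops the application, loops over everything, and marks the objects that have memory assigned to them. [figure]
Internally, Ruby tracks whether or not each object is marked in the free bitmap. [figure] Ruby stores the bitmap
in a separate memory location in order to make use of the unix copy-on-write optimization.

Unmarked objects are cleaned up. [figure] Throughout this work Ruby does not copy objects; it returns objects to
the list by adjusting internal pointers to build a new link list, so the work finishes in quite a short time.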

Mark and Sweep

Eventually a messy house fills up with trash and life can’t continue as usual. After your Ruby program runs for
some time, the free list will eventually be entirely used up:

Here all of the precreated Ruby objects have been used by our application (they are all gray) and no objects
remain on the free list (no white squares are left).

At this point Ruby uses another algorithm invented by McCarthy known as Mark and Sweep. First Ruby stops your
application; Ruby uses “stop the world garbage collection.” Ruby then loops through all of the pointers,
variables and other references our code makes to objects and other values. Ruby also iterates over internal
pointers used by its virtual machine. It marks each object that it is able to reach using these pointers. I
indicate these marks using the letter M here:

Above the three objects marked with “M” are live, active objects that our application is still using.
Internally, Ruby actually uses a series of bits known as the free bitmap to keep track of which objects are
marked or not:

Ruby saves the free bitmap in a separate memory location in order to take full advantage of Unix copy-on-write
optimization. For more information on this, see my article Why You Should Be Excited About Garbage Collection
in Ruby 2.0.

If the marked objects are live, the remaining, unmarked objects must be garbage, meaning they are no longer
being used by our code. I’ll show the garbage objects as white squares below:

Next Ruby sweeps the unused, garbage objects back onto the free list:

Internally this happens quite quickly, since Ruby doesn’t actually copy objects around from one place to
another. Instead, Ruby places the garbage objects back onto the free list by adjusting internal pointers to
form a new linked list.

Now Ruby can give these garbage objects back to us the next time we create a new Ruby object. In Ruby, objects
are reincarnated, and enjoy multiple lives!

Going out of the way to write "link list" in Latin letters, of all things... no.
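
The two phases in the original article can be sketched as a toy collector. Everything here (Obj, mark, sweep, the root list) is hypothetical naming of mine; the mark flag lives inside each object rather than in a separate bitmap, so this shows the idea rather than MRI's layout.

```python
class Obj:
    def __init__(self, name):
        self.name = name
        self.refs = []        # outgoing references to other Obj instances
        self.marked = False


def mark(roots):
    """Mark every object reachable from the roots (iterative, no recursion)."""
    stack = list(roots)
    while stack:
        obj = stack.pop()
        if not obj.marked:
            obj.marked = True
            stack.extend(obj.refs)


def sweep(heap, free_list):
    """Move unmarked objects onto the free list; clear marks on survivors."""
    live = []
    for obj in heap:
        if obj.marked:
            obj.marked = False
            live.append(obj)
        else:
            obj.refs.clear()
            free_list.append(obj)
    return live


abc, def_, jkl, mno = (Obj(n) for n in ("ABC", "DEF", "JKL", "MNO"))
jkl.refs.append(mno)          # JKL refers to MNO
heap = [abc, def_, jkl, mno]
roots = [jkl, def_]           # nothing refers to ABC any more
free_list = []

mark(roots)
heap = sweep(heap, free_list)
print([o.name for o in heap])       # ['DEF', 'JKL', 'MNO']
print([o.name for o in free_list])  # ['ABC']
```

Note that sweep only relinks objects; nothing is copied, which matches the original's remark that the step is quick.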

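6) Mark & Sweep and reference counting

The reasons some languages do not adopt reference counting as their garbage collector algorithm are:

    This technique is hard to implement (インプリ): space has to be set aside inside each object for the reference
    count, there are operations that move variables/references up and down (変数/参照を上下変化させるオペーレーション), and so on.

    Python updates reference counts very frequently; for example, when you stop using a large data structure, the
    reference count adjustments happen all at once and become quite complex work, which can slow the application down.

    Reference counting cannot be used for cyclic (巡回する) data structures. (Details next time.)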

Mark and Sweep vs. Reference Counting

At first glance, Python’s GC algorithm seems far superior to Ruby’s: why live in a messy house when you can
live in a tidy one? Why does Ruby force your application to stop running periodically each time it cleans up,
instead of using Python’s algorithm?

Reference counting isn’t as simple as it seems at first glance, however. There are a number of reasons why many
languages don’t use a reference counting GC algorithm like Python does:

First, it’s difficult to implement. Python has to leave room inside of each object to hold the reference
count. There’s a minor space penalty for this. But worse, a simple operation such a changing a variable or
reference becomes a more complex operation since Python needs to increment one counter, decrement another,
and possibly free the object.

Second, it can be slower. Although Python performs GC work smoothly as your application runs (cleaning dirty
dishes as soon as you put them in the sink), this isn’t necessarily faster. Python is constantly updating
the reference count values. And when you stop using a large data structure, such as a list containing many
elements, Python might have to free many objects all at once. Decrementing reference counts can be a complex,
recursive process.

Finally, it doesn’t always work. As we’ll see in my next post containing my notes from the rest of this
presentation, reference counting can’t handle cyclic data structures – data structures that contain
circular references.

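What is "an operation that moves variables/references up and down" (変数/参照を上下変化させるオペーレーション) supposed to be? Also, I'd think "cyclic data structures" should be 「循環」 (circular) data structures rather than 「巡回」. Though for Cyclic Redundancy Check the standard term is indeed 「巡回冗長検査」.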
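
The third point, that reference counting alone cannot reclaim circular references, is easy to reproduce on CPython. The sketch assumes CPython's behavior (reference counting plus a separate cycle detector in the `gc` module); the Node class and the weak-reference probe are just my way of observing whether the object still exists.

```python
import gc
import weakref


class Node:
    def __init__(self, name):
        self.name = name
        self.other = None


gc.disable()                  # keep the cycle detector from running on its own
a, b = Node("a"), Node("b")
a.other, b.other = b, a       # a <-> b: a circular reference
probe = weakref.ref(a)        # lets us test for existence without keeping a alive

del a, b                      # no variable refers to the pair any more
alive_after_del = probe() is not None      # the counts never reached zero

gc.collect()                  # run CPython's cycle detector explicitly
alive_after_collect = probe() is not None
gc.enable()

print(alive_after_del, alive_after_collect)
```

Each node keeps the other's count at one, so deleting the variables frees nothing; only the separate collector pass reclaims the pair.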

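They find good articles to write about, so I wish they would put more effort into it. Go言語で苦労したポイントの事例 - ワザノバ | wazanova.jp; Spotify: 大きな障害の予兆となる小さな障害 [Postmortem7] - ワザノバ | wazanova.jp. Oh, and I haven't checked their other articles against the originals.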

■_ CVE

I've mostly got a handle on the situation, but there are still a few things I don't quite understand: CVE-2013-0242 in Ubuntu

■_

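Not pointing out the (apparent) mistakes to the author directly, but grumbling about them in a place like this instead: I'm pretty much beyond saving, lol.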

October 28, 2013

■_

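The end time is a bit of a problem... wait, it's already full?! lol 歌舞伎座.tech#2 - connpass

The report assignment for this installment of 「情報技術者の社会的責任」 (The Social Responsibility of Information Engineers) looks interesting (not that I'll do it).

情報技術者の社会的責任第6話 (The Social Responsibility of Information Engineers, episode 6) from Hiroki Kashiwazaki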

■_

PHP. I don't think the claims are entirely new (I only skimmed it, so I may well have missed something).

Why PHP is for Real | Acquia

PHP has evolved from its humble beginnings as the lingua franca for script kiddies, into a “true” programming
language. Nowadays, huge numbers of businesses and organizations of every size depend on code written in PHP.

When you look at the top 10 of most popular programming languages, PHP holds the number 5 position. When we
focus on languages used for websites and web applications, PHP is number one. Did you know that 3 PHP developers
are listed in the top 10 of most active GitHub contributors?

In this article, I want to touch on a few reasons why PHP is so popular, and why it may be the best bet for your
business or application:

Skipping the explanatory parts, here is the conclusion:

Why PHP is for Real | Acquia

Conclusion

PHP is a great programming language for the web and is the right tool for the job more often than not. Its
broad community and adoption for mission critical use at scale; rich ecosystems of tools, hosting, and
frameworks; and powerful CMS’s like Drupal all make PHP an essential part of today’s web.

I can't really judge (I don't use it).

■_

Spotted in 2013年10月25日号　Ubuntu 14.04 LTS “Trusty Tahr”・Ubuntu 13.10 日本語RemixのRC・UWN#339：Ubuntu Weekly Topics｜gihyo.jp … 技術評論社 (the October 25, 2013 issue of Ubuntu Weekly Topics):

2013年10月25日号　Ubuntu 14.04 LTS “Trusty Tahr”・Ubuntu 13.10 日本語RemixのRC・UWN#339：Ubuntu Weekly Topics｜gihyo.jp … 技術評論社

usn-1991-1: Security update for the GNU C Library

        https://lists.ubuntu.com/archives/ubuntu-security-announce/2013-October/002279.html
        Updates have been released for Ubuntu 13.04, 12.10, 12.04 LTS, and 10.04 LTS. They fix CVE-2012-4412, CVE-2012-4424, CVE-2013-0242, CVE-2013-1914, CVE-2013-4237, and CVE-2013-4332.
        They fix vulnerabilities lurking in libc's strcoll(), multibyte handling in regular expressions, getaddrinfo, and readdir_r() that could lead to DoS or memory corruption.
        What to do: apply the updates, then reboot the system.

This caught my attention a bit, so:

[USN-1991-1] GNU C Library vulnerabilities

Details:

It was discovered that the GNU C Library incorrectly handled the strcoll()
function. An attacker could use this issue to cause a denial of service, or
possibly execute arbitrary code. (CVE-2012-4412, CVE-2012-4424)

It was discovered that the GNU C Library incorrectly handled multibyte
characters in the regular expression matcher. An attacker could use this
issue to cause a denial of service. (CVE-2013-0242)

It was discovered that the GNU C Library incorrectly handled large numbers
of domain conversion results in the getaddrinfo() function. An attacker
could use this issue to cause a denial of service. (CVE-2013-1914)

It was discovered that the GNU C Library readdir_r() function incorrectly
handled crafted NTFS or CIFS images. An attacker could use this issue to
cause a denial of service, or possibly execute arbitrary code.
(CVE-2013-4237)

It was discovered that the GNU C Library incorrectly handled memory
allocation. An attacker could use this issue to cause a denial of service.
(CVE-2013-4332)

What's the quickest way to check what was actually changed for something like CVE-2013-0242? Probably along this path: USN-1991-1: GNU C Library vulnerabilities | Ubuntu → CVE-2013-0242 in Ubuntu → Bug 15078 – regex crash on myanmar script → Andreas Schwab - [PATCH] Fix buffer overrun in regexp matcher

■_

October 27, 2013

■_

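Various things are not going well.

I went to the used book fair in Jimbocho. There were books I would have bought if my wallet had allowed it... ○|￣|＿ Such as a complete set of that Iwanami series. Or a complete three-volume set of a certain Springer series.

■_ This

(?[ ]) is new syntax? It must be. It looks like it does set operations (only &?) inside the brackets.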

“Unicode Programming in Modern Perl” slides for the Internationalization & Unicode Conference: https://t.co/4rQSVVTi9T #iuc37 #unicode #perl
— Nick Patch (@nickpatch) October 22, 2013

Unicode Programming in Modern Perl // Speaker Deck

Did the embed come out all right?

■_

Found a couple more interesting-looking forthcoming books: Python Data Visualization Cookbook
Clojure for Domain-specific Languages

Looking up the publisher (I've seen this one before): Python Data Visualization Cookbook | Packt Publishing; Clojure for Domain-specific Languages | Packt Publishing; Also available on: O'Reilly. So I'll wait and see for a bit.

■_

「1.01の法則」のウソ ~ If it’s not fun, why do it? ~ - コピーライターの目のつけどころ(ダークサイド)
1分1秒を争う障害対応のためのリーダブルコード
「無用の用」と「不易流行」
(web archive) ＳＦ設定という仕事　金子隆一＆小林伸光インタビュー
東京 Node 学園祭 2013 に行ってきた #nodefest - Kato Kazuyoshi
東急ハンズのロゴに描かれた"手"には深い意味があった! -担当者に聞いてみた | マイナビニュース
コンピュータ初心者が学ぶのに適したプログラミング言語は？：まあまあ元気になる話：ITmedia オルタナティブ・ブログ
俺の被害妄想でrailsが死ぬ時 - komagata
本の虫: C++11参考書を公開した後の予定
本の虫: 日本語のC++参考書の行く末
BlackBerryを救う方法 | TechCrunch Japan
BlackBerryの命は風前の灯だが、自由という名の炎の洗礼を受ける気があるなら、まだ望みはある。もう、こうなったら、怖いもの・失うものは何もないのだから、今土壇場のBlackBerryは、モバイルの業界全体と世界の政府をディスラプトするための、度胸を持てるだろう。
BlackBerry is a flickering candle about to be snuffed, but hope yet lies in the baptismal flame of liberty. With nothing left to lose, perhaps BlackBerry will have the courage to disrupt its competitors and world governments.

October 26, 2013

■_

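There's no Ministop near me, though. ミニストップで野帳が売っている件について - phaの日記

Oh, 「パターン・ランゲージ」 (Pattern Language) is already out.

New on 10/22: ISBN978-4-7664-1987-0, Keio University Press, 『パターン・ランゲージ創造的な未来をつくるための言語』, edited and co-written by 井庭崇, written by 中埜博, 江渡浩一郎, 中西泰人, 竹中平蔵, and 羽生田栄一. 30 copies in stock.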
— ジュンク堂書店池袋本店/PC書 (@junkudo_ike_pc) October 22, 2013

This one also interests me a little.

New on 10/26: ISBN978-4-86100-889-4, ビー・エヌ・エヌ新社 (BNN), 『ほんとに使える「ユーザビリティ」より良いデザインへのシンプルなアプローチ』, by Eric Reiss, translated by 浅野紀予. 40 copies in stock.
— ジュンク堂書店池袋本店/PC書 (@junkudo_ike_pc) October 26, 2013

■_

This one: Amazon.co.jp： Python Swallowed Whole: Core Developers Define Python: Steve Holden: 洋書, release date 2016/1/23. Listing it this far ahead is a bit premature, to put it mildly.

■_ Contract-based programming

An article from a little while back: Contract-based programming: making software more reliable | Embedded

Contract-based programming: making software more reliable | Embedded

(omitted)

Operations in Ada are parameterizable subprograms (procedures or functions), and preconditions and
postconditions are manifested through Boolean expressions that are associated with the subprogram’s
declaration. Here is an example of a simple function that computes the maximum in an array of Float values,
with pre- and postconditions establishing the function’s contract:

  type Float_Array is array (Integer range <>) of Float;
  -- Different objects may have different lengths
 
  function Max( A : Float_Array ) return Float
  with
    Pre => A'Length > 0,
    Post => (for all F of A => F <= Max'Result) and
            (for some F of A => F = Max'Result);
 
  function Max( A : Float_Array ) return Float is
  begin
    … -- Algorithm to compute max value
  end Max;

  A1 : Float_Array(1..5) := (-10.0, 20.34, -123.45, 0.0, 0.0);
  F1 : Float := Max(A1); -- 20.34
  A2 : Float_Array(1..0); -- No elements in A2
  F2 : Float := Max(A2); -- Precondition violation, since A2'Length=0

What on earth is Integer range <>, I wonder.
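
On that question: in Ada, `Integer range <>` declares an unconstrained array type, meaning the index bounds are not fixed by the type and are supplied per object (as in A1's 1..5 and A2's 1..0). As for the contract itself, here is a sketch of the same pre- and postconditions written with plain assertions in Python. The function name is mine, and assertions are only a stand-in for Ada's contract aspects (Python skips them under -O).

```python
def max_of(values):
    """Return the largest element, checking the contract with assertions."""
    # Precondition: the input must not be empty (Pre => A'Length > 0).
    assert len(values) > 0, "precondition violated: empty input"

    result = values[0]
    for v in values[1:]:
        if v > result:
            result = v

    # Postcondition: no element exceeds the result, and some element equals it.
    assert all(v <= result for v in values)
    assert any(v == result for v in values)
    return result


print(max_of([-10.0, 20.34, -123.45, 0.0, 0.0]))   # 20.34
try:
    max_of([])                                     # like A2, no elements
except AssertionError as exc:
    print(exc)
```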

■_

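Feeling listless.

October 25, 2013

■_

People say 「ブランチを切る」 (literally "cut a branch", meaning to create one), right? With git and the like. It suddenly occurred to me to wonder what that is in English. Someone suggested it would simply be "cut branch" (切る → cut), but searching on Google... to "cut" a branch seems to be used in the sense of deleting a branch that is no longer needed... (Probably not to be continued.)

■_ 大人買い (buying up the whole lot)

This sort of thing feels a little different from 「大人買い」 (buying up the whole lot, the way an adult with money can), but I won't be picky. O'Reilly Japan - オライリー・ジャパンの在庫書籍全点を導入の猛者あらわる！ - Information from O'Reilly Japan. Compared with the situation at a certain company, well, rest omitted. That aside: オライリー・ジャパンの現行在庫書籍すべてを大人買いした企業現る | スラッシュドット・ジャパン IT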

コメント#2484287 | オライリー・ジャパンの現行在庫書籍すべてを大人買いした企業現る | スラッシュドット・ジャパン

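A supplement.
I did the math from the catalog.

http://www.oreilly.co.jp/catalog/ [oreilly.co.jp]
398 titles are listed (leaving out the logo backpack).
Simply adding up the prices gives 1,363,593 yen.

This purchase covered the roughly 350 titles currently active,
so it does look like one set comes to a little over 1,000,000 yen.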

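One million three hundred sixty thousand... So the average is a little over 3,400 yen. It's a simple mean, so take it with a grain of salt, but it's not far from what I had imagined?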
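
The figures in the comment are easy to re-derive. The sketch below uses only the numbers quoted there; scaling the catalog average to 350 titles is my own rough estimate, since the actual prices of the titles in stock are not given.

```python
catalog_titles = 398
catalog_total_yen = 1_363_593

average = catalog_total_yen / catalog_titles
print(round(average))                 # 3426

# Rough estimate for the roughly 350 titles actually purchased,
# assuming they cost about the catalog average each.
estimate_350 = average * 350
print(round(estimate_350, -3))        # on the order of 1.2 million yen
```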

■_ for

Come to think of it, I have a feeling I looked into this a little myself once.

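As a non-native speaker of English I don't really understand why the loop construct is named "for", but since so many languages have adopted it, it must be intuitively convincing to a great many users.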
— (32) 齊藤敦志 (@SaitoAtsushi) October 25, 2013

No, I think the sensibility that abbreviates "for each" to "for" is a peculiar one.
— がちゃぴん先生 (@kosaki55tea) October 25, 2013

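@kosaki55tea The loops of the era when the for keyword was first introduced were the kind where you specify an initial value, a step, and a final value, so I don't think there was much of an "each" feel. I'd guess the "each" feel only appeared after the abstraction of enumerating a collection.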
— Akinori MUSHA (@knu) October 25, 2013

@knu If you think of it as a means of doing for each over an array, it doesn't feel odd. Come to that, is there any other usage in English where "for" suggests a loop?
— がちゃぴん先生 (@kosaki55tea) October 25, 2013

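@kosaki55tea Rather, the traditional for statement turns a loop variable in order to walk an array, and the star of the for is the loop variable (in that case the subscript or index), so I figured it was "iterate for <loop variable>".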
— Akinori MUSHA (@knu) October 25, 2013

@knu Seeing it as "iterate for" is unexpected, or rather refreshing. My understanding was that loops were refined into the concept of "iterate" in a much later era.
— がちゃぴん先生 (@kosaki55tea) October 25, 2013

@kosaki55tea @knu ALGOL's grammar is along the lines of FOR n FROM 1 TO 3, so I think it can be taken either as short for "iterate for" or as short for "for each".
— Kazuho Oku (@kazuho) October 25, 2013

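@knu @kosaki55tea My image of the for in fortran and the like was that it just enumerates an arithmetic progression for use in the block inside, with no direct connection to arrays.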
— Akihiko Koizuka (@koizuka) October 25, 2013

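@kazuho @kosaki55tea Yes. My guess is that plain "for" got the meaning across, so it was kept short, and later, when abstractions like enumerating collections and lists came along, "for" was simply reused without bothering to add "each".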
— Akinori MUSHA (@knu) October 25, 2013

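@kazuho @knu My feeling is that if you were going to drop one word from "iterate for", you would normally cut the preposition. By that logic, keeping "for", the word that doesn't carry the meaning, strikes me as far crazier than "for each". Though I have no evidence whatsoever; it's only an impression.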
— がちゃぴん先生 (@kosaki55tea) October 25, 2013

@kosaki55tea @knu To change my opinion completely: in ALGOL, which was probably the first to introduce FOR, it is (an inversion of) "do something for ...", so I think neither "for each" nor "for" is odd as English.
— Kazuho Oku (@kazuho) October 25, 2013

The point being that both "do something for a, b, and c" and "do something for each of a, b, and c" work in English.
— Kazuho Oku (@kazuho) October 25, 2013

@grove_twtr Hello. I am the one who wrote that, and I am panicking a little: "did anyone ever explicitly say such a thing?" (==;
— ささきしげお (@SigSasaki) October 25, 2013

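@grove_twtr Algol was the first to use for as a loop construct (← a fact, I believe); some Algol for statements read the same way as \sum_{i=1} (← also a fact); the Algol for statement originates from \sum_{i=1} (← where is that written?)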
— ささきしげお (@SigSasaki) October 25, 2013

■_

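And now, that thing. Twitter / _shimada: "In the US the number of Java jobs has grown by almost 0% over the past eight years, and PH ... Looking at PHP, Java, Ruby Job Trends | Indeed.com, the Ruby line does indeed sit way up at the top.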

PHP, Java, Ruby Job Trends

PHP jobs - Java jobs - Ruby jobs

Then, out of curiosity, I switched the chart to absolute scale, and, well... lol

PHP, Java, Ruby Job Trends

PHP jobs - Java jobs - Ruby jobs

And here is the absolute scale with Ruby removed.

PHP, Java Job Trends

PHP jobs - Java jobs

So that person really is (rest omitted)

The comment section of the original article calls this out too, and the reply to that is, well...

■_

October 24, 2013

■_

Magmas, semigroups, and monoids are going round and round in my head.

Ah, I never did get around to writing it up.

The podcast and video (YouTube) of the talk session I did with @kakutani at Junkudo on 9/28 are now public. Thanks in advance! / “Shintaro Kakutani × Takuto Wada: Pearson technical books of the 2000s and us…” http://t.co/tNVbI3nRJE
— Takuto Wada (@t_wada) October 24, 2013

■_ nfu

Looks interesting. I think USP built this sort of thing as one command per function.

nfu: Command-line Numeric Fu - Factual Blog

We often use the UNIX command line for ad-hoc data crunching. Most of the time we have the good sense to use
a better tool after the first 100 characters or so, but sometimes we’ll just blow past the right margin with a
string of sort, uniq -c, sort -nr, cut -f1, and other “glue” commands. To make this easier, I decided to
bundle a bunch of common ones up into a Perl script called nfu.

The idea behind nfu is to save as much command-line real estate as possible for simple command-line data
analysis. It’s designed to wrap or replace a bunch of filter processes like sort, uniq, and in many cases, awk
and perl, by providing a series of composable operators designed to operate on rows of whitespace
column-delimited text input. For example, two such operators are “sum” and “delta”:

$ seq 4 | nfu -s       # or nfu --sum
1
3
6
10
$ seq 4 | nfu -d       # or nfu --delta
1
1
1
1
$

Operators compose by juxtaposition (as described in further detail)

$ seq 4 | nfu -ss
1
4
10
20
$

(rest omitted)

©2013 Factual Inc., All Rights Reserved

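It seems to do quite a variety of things. I could not tell at a glance what -ss does in the example above ("Operators compose by juxtaposition", you say...), but lining up the input, the output of -s, and the output of -ss:

  1 2  3  4
  1 3  6 10
  1 4 10 20

I see.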
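
The three rows above can be reproduced with a short Python sketch. It only models the arithmetic of the -s and -d operators as described in the quoted post; the function names are mine and nothing here calls nfu itself.

```python
from itertools import accumulate


def running_sum(values):
    """Model of one -s stage: a running total of its input."""
    return list(accumulate(values))


def delta(values):
    """Model of -d: difference between successive numbers (first value kept)."""
    out, prev = [], 0
    for v in values:
        out.append(v - prev)
        prev = v
    return out


seq4 = [1, 2, 3, 4]
once = running_sum(seq4)    # like: seq 4 | nfu -s
twice = running_sum(once)   # like: seq 4 | nfu -ss
print(once, twice, delta(seq4))
```

Applying delta to the output of running_sum gives back the original sequence, which is what "the inverse of sum" means here.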

spencertipping/nfu
nfu is designed to do a bunch of common/useful numeric tasks to text-oriented data. For example, suppose you
want to look at the cumulative distribution of words in a text file, ordered by most common first. In plain
shell, you'd probably write something like this:

Here's what you'd say with nfu:

$ egrep -o '\w+' file | nfu -gcOsf0p 'with lines'

g = "group", which sorts things
c = "count", which means uniq -c
O = "reverse order", which means sort -rn
s = "sum"
f0 = "field 0", which is what awk calls $1
p = "plot", which uses gnuplot and croaks if you don't have it

(omitted)

Commands

nfu chains commands together just like a shell pipeline. This means that order matters; nfu -sc and nfu -cs do
two completely different things.

-a, --average: Generates a running average of the last N elements. If N = 0 or is not provided, then
generates a running average of all numbers.

-c, --count: Pipes data through uniq -c to count adjacent, equivalent items. You should probably use -g
before this unless your data is already grouped or you just want run lengths.

-d, --delta: The inverse of sum; returns the difference between successive numbers.

-e, --eval: Allows you to transform data with a Perl expression. Individual fields are available in @_. If
you return a single value, then it replaces the first column; otherwise your data replaces all
values in the row. If you return an empty list, no output row is generated.

-f, --fields: Allows you to reorder fields arbitrarily, outputting tab-delimited data. Takes a single string
of digits, each of which is a zero-based field index.

-g, --group: Pipes data through sort to create groups of equivalent entries. Assumes lexicographic, not numeric, sort.

-G, --rgroup: Same as group, but reverses the sort order.

-l, --log: Log-transforms every value.

-L, --exp: Exponent-transforms every value.

-o, --order: Orders elements by numeric value.

-O, --rorder: Same as order, but reverses the sort.

-p, --plot: Plots the input data as-is. You may need to reorder or slice fields to get gnuplot to work correctly.

-P, --poll: Takes an interval in seconds and a command, and runs the command forever, sleeping by the
interval between runs. You can use this to generate a stream of data.

-q, --quant: Quantize each number to the nearest x, which defaults to 1. x can be any positive value.

-s, --sum: Takes a running total of the given numbers.

-S, --slice: Takes two numbers: #lines to chop from head, #lines to chop from tail.

■_ nfu

And here is the command itself. Shorter than I expected. Well, it is Perl.



#!/usr/bin/env perl
# nfu: Command-line numeric fu | Spencer Tipping
# Licensed under the terms of the MIT source code license

use v5.10;
use strict;
use warnings;
use POSIX qw(dup2);

$|++;

my %explosions = (
  a => '--average',
  v => '--variance',
  c => '--count',
  d => '--delta',
  e => '--eval',
  f => '--fields',
  g => '--group',
  G => '--rgroup',
  l => '--log',
  L => '--exp',
  o => '--order',
  O => '--rorder',
  s => '--sum',
  S => '--slice',
  q => '--quant',
  p => '--plot',
  P => '--poll',
);

my %arity = (
  average  => 1,
  variance => 1,
  count    => 0,
  delta    => 0,
  eval     => 1,
  fields   => 1,
  group    => 0,
  rgroup   => 0,
  log      => 0,
  exp      => 0,
  order    => 0,
  rorder   => 0,
  plot     => 1,
  poll     => 2,
  sum      => 0,
  slice    => 2,
  quant    => 1,
);

my %functions = (
  count  => sub {exec 'uniq', '-c'  or die 'failed to exec "uniq -c"'},
  group  => sub {exec 'sort'        or die 'failed to exec "sort"'},
  rgroup => sub {exec 'sort', '-r'  or die 'failed to exec "sort -r"'},
  order  => sub {exec 'sort', '-n'  or die 'failed to exec "sort -n"'},
  rorder => sub {exec 'sort', '-rn' or die 'failed to exec "sort -rn"'},

  average => sub {
    my $size = $_[0];
    my ($n, $total) = (0, 0);
    my @window = ();
    while (<STDIN>) {
      chomp;
      my ($x, @xs) = split;
      print join("\t", ($total += $x) /
                       (++$n > $size && $size ? $size : $n), @xs), "\n";
      $total -= shift @window if $size and push(@window, $x) >= $size;
    }
  },

(omitted)

  sum => sub {
    my $total = 0;
    while (<STDIN>) {
      chomp;
      my ($x, @xs) = split;
      print join("\t", $total += $x, @xs), "\n";
    }
  },

(omitted)

sub explode {
  return $_ unless s/^-([^-])/$1/;
  map {$explosions{$_} // $_} grep length, split /([.\d]*),?/;
}

$SIG{CHLD} = 'IGNORE';

my $reader   = undef;
my @exploded = map explode, @ARGV;

# Note: the loop below uses pipe/fork/dup2 instead of a more idiomatic Open2
# call. I don't have a good reason for this other than to figure out how the
# low-level stuff worked.
while (@exploded) {
  (my $command = shift @exploded) =~ s/^--//;
  my  $arity   = $arity{$command} // die "undefined command: $command";
  my  @args    = splice @exploded, 0, $arity;

  # Here's where things get fun. The question right now is, "do we need to
  # fork, or can we run in-process?" -- i.e. are we in the middle, or at the
  # end? When we're in the middle, we want to redirect STDOUT to the pipe's
  # writer and fork; otherwise we run in-process and write directly to the
  # existing STDOUT.
  if (@exploded) {
    # We're in the middle, so allocate a pipe and fork.
    pipe my($new_reader), my($writer);
    unless (fork) {
      # We're the child, so do STDOUT redirection.
      close $new_reader or die "failed to close pipe reader: $!";
      dup2(fileno($reader), 0) or die "failed to dup input: $!"
        if defined $reader;
      dup2(fileno($writer), 1) or die "failed to dup stdout: $!";

      close $reader or die "failed to close reader: $!" if defined $reader;
      close $writer or die "failed to close writer: $!";

      # The function here may never return.
      $functions{$command}->(@args);
      exit;
    } else {
      close $writer or die "failed to close pipe writer: $!";
      $reader = $new_reader;
    }
  } else {
    # We've hit the end of the chain. Preserve stdout, redirect stdin from
    # current reader.
    dup2(fileno($reader), 0) or die "failed to dup input: $!"
      if defined $reader;
    $functions{$command}->(@args);
  }
}

So with -ss, the output of the first -s is fed through a pipe into a second -s stage that does the same computation. I see.
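
How a bundled option such as -gcOsf0p turns into a chain of long options can be sketched in Python as follows. This is a rough analogue of the explode sub in the listing, not a port: the tokenising regular expression is my own simplification, and only the short-to-long table is copied from the Perl source.

```python
import re

# Copied from the %explosions table in the listing.
SHORT_TO_LONG = {
    "a": "--average", "v": "--variance", "c": "--count", "d": "--delta",
    "e": "--eval", "f": "--fields", "g": "--group", "G": "--rgroup",
    "l": "--log", "L": "--exp", "o": "--order", "O": "--rorder",
    "s": "--sum", "S": "--slice", "q": "--quant", "p": "--plot",
    "P": "--poll",
}


def explode(arg):
    """Expand one bundled short option; pass everything else through."""
    if not re.match(r"-[^-]", arg):
        return [arg]  # long option or plain argument
    # Runs of digits/dots are operands; any other character is one command.
    tokens = re.findall(r"[.\d]+|[^.\d,]", arg[1:])
    return [SHORT_TO_LONG.get(t, t) for t in tokens]


print(explode("-gcOsf0p"))
print(explode("-ss"))
```

With this expansion, -ss simply becomes two --sum entries, which the main loop then runs as two pipeline stages.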

■_ 0 base

Why Python uses 0-based indexing. It's because of slices | Hacker News Guido van Rossum: Why Python uses 0-based indexing : Python

Guido van Rossum - Google+ - I was asked on Twitter why Python uses 0-based indexing,…

I was asked on Twitter why Python uses 0-based indexing, with a link to a new (fascinating) post on the subject
( http://exple.tive.org/blarg/2013/10/22/citation-needed/ ). I recall thinking about it a lot; ABC, one of
Python's predecessors, used 1-based indexing, while C, the other big influence, used 0-based. My first few
programming languages (Algol, Fortran, Pascal) used 1-based or variable-based. I think that one of the issues
that helped me decide was slice notation.

Let's first look at use cases. Probably the most common use cases for slicing are "get the first n items"
and "get the next n items starting at i" (the first is a special case of that for i == the first index).
It would be nice if both of these could be expressed as without awkward +1 or -1 compensations.

Using 0-based indexing, half-open intervals, and suitable defaults (as Python ended up having), they are
beautiful: a[:n] and a[i:i+n]; the former is long for a[0:n].

Using 1-based indexing, if you want a[:n] to mean the first n elements, you either have to use closed intervals
or you can use a slice notation that uses start and length as the slice parameters. Using half-open intervals
just isn't very elegant when combined with 1-based indexing. Using closed intervals, you'd have to write
a[i:i+n-1] for the n items starting at i. So perhaps using the slice length would be more elegant with 1-based
indexing? Then you could write a[i:n]. And this is in fact what ABC did -- it used a different notation so you
could write a@i|n. (See http://homepages.cwi.nl/~steven/abc/qr.html#EXPRESSIONS )

But how does the index:length convention work out for other use cases? TBH this is where my memory gets fuzzy,
but I think I was swayed by the elegance of half-open intervals. Especially the invariant that when two slices
are adjacent, the first slice's end index is the second slice's start index is just too beautiful to ignore.
For example, suppose you split a string into three parts at indices i and j -- the parts would be a[:i], a[i:j],
and a[j:].

So that's why Python uses 0-based indexing.
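
The slice properties described in the post are easy to check directly. The list and the values of n, i, and j below are arbitrary examples of mine.

```python
a = list("abcdefgh")
n, i, j = 3, 2, 5

first_n = a[:n]       # "get the first n items"; same as a[0:n]
next_n = a[i:i + n]   # "get the next n items starting at i"

# Splitting at i and j: each slice ends where the next one starts.
head, middle, tail = a[:i], a[i:j], a[j:]

print(first_n, next_n)
print(head, middle, tail)
```

With half-open intervals the length of a[i:j] is simply j - i, and the three parts concatenate back to the original list without any +1 or -1 adjustments.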

■_

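Something about this bothers me. 教育ビジネスを考える。行動する。改善する。 (Thinking about the education business. Acting. Improving.): In the US the number of Java jobs has grown by almost 0% over the past eight years, while PHP is up 250%. In Japan too, will sales stall without senior PHPers!?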

■_

Yoriyuki Yamagata - Google+ - People who like dynamic languages seem to me to tend to see static typing as a constraint on programming, but this is typed lambda calculus…

October 23, 2013

■_

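0と1の話 ブール代数とシャノン理論 (A Story of 0 and 1: Boolean Algebra and Shannon's Theory)
 Near the beginning the word "カルテク" (Karuteku) appears, and I thought a translator's note would have helped, although the surrounding text made it immediately clear that it means the California Institute of Technology. カリフォルニア工科大学 (California Institute of Technology) - Wikipedia. I had always assumed the Japanese rendering was "カルテック" (Karutekku), but apparently "カルテク" is acceptable too. Hm.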

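■_ "Nobody knows where the servant went"

Like everyone else, I was made to read this in a junior high or high school Japanese textbook. That aside, I once heard an interesting opinion about what the final line, "下人の行方は誰も知らない" ("Nobody knows where the servant went"), was meant to say, and I have more or less accepted it myself. From a quick look, nobody else seems to be saying the same thing lol. The ending of Ryunosuke Akutagawa's Rashomon. It says "Nobody knows where the servant went", but... - Yahoo!知恵袋; Ryunosuke Akutagawa, Rashomon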

■_

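PowerShell. I have been writing a script that pulls data out of several Excel files, and it is a lot nicer than doing it in VBA lol. But can't something be done about having to write $excel.worksheets.items(3) to select, say, the third sheet? I wish I could write $excel.worksheets[3]. I am also unhappy that the range property, which specifies a range, only takes A1 notation, as in $excel.worksheets.items(3).cells.range("A1:A10"). I would like to be able to give numbers, as in $excel.worksheets.items(3).cells.range(@(1,1), @(10,1)). Converting back and forth between the column letters A B C ... and the corresponding numbers is a pain lol. In VBA you go through the cell object, but this kind of thing is at least possible. Well, it only takes writing a conversion function once.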
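
The conversion function mentioned above is small. Here is a sketch in Python rather than PowerShell; all three function names are hypothetical helpers of mine, and a1_range takes (row, column) pairs in the same order as the wished-for range(@(1,1), @(10,1)) call.

```python
def column_to_number(letters):
    """Convert a column name such as 'A' or 'AA' to its 1-based number."""
    if not letters:
        raise ValueError("empty column name")
    n = 0
    for ch in letters.upper():
        if not "A" <= ch <= "Z":
            raise ValueError(f"not a column letter: {ch!r}")
        n = n * 26 + (ord(ch) - ord("A") + 1)
    return n


def number_to_column(n):
    """Convert a 1-based column number back to letters."""
    if n < 1:
        raise ValueError("column numbers start at 1")
    letters = ""
    while n > 0:
        n, rem = divmod(n - 1, 26)   # there is no zero digit, hence n - 1
        letters = chr(ord("A") + rem) + letters
    return letters


def a1_range(top_left, bottom_right):
    """Build an A1-style range string from two (row, column) pairs."""
    (r1, c1), (r2, c2) = top_left, bottom_right
    return f"{number_to_column(c1)}{r1}:{number_to_column(c2)}{r2}"


print(a1_range((1, 1), (10, 1)))
```

The letters form a bijective base-26 numeral system, which is why the reverse conversion subtracts 1 before each division.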

October 22, 2013

■_

Now, what shall I do... Comparing page-turn speed on the 2013 and 2012 Kindle Paperwhite models - GIGAZINE

■_ single dispatch function

I skimmed it. Is it fair to see this as an application of decorators? PEP 443 -- Single-dispatch generic functions

PEP 443 -- Single-dispatch generic functions

To define a generic function, decorate it with the @singledispatch decorator. Note that the dispatch happens on
the type of the first argument. Create your function accordingly:

>>> from functools import singledispatch
>>> @singledispatch
... def fun(arg, verbose=False):
...     if verbose:
...         print("Let me just say,", end=" ")
...     print(arg)

To add overloaded implementations to the function, use the register() attribute of the generic function. This
is a decorator, taking a type parameter and decorating a function implementing the operation for that type:

>>> @fun.register(int)
... def _(arg, verbose=False):
...     if verbose:
...         print("Strength in numbers, eh?", end=" ")
...     print(arg)
...
>>> @fun.register(list)
... def _(arg, verbose=False):
...     if verbose:
...         print("Enumerate this:")
...     for i, elem in enumerate(arg):
...         print(i, elem)

With definitions like these in place,

PEP 443 -- Single-dispatch generic functions

>>> fun("Hello, world.")
Hello, world.
>>> fun("test.", verbose=True)
Let me just say, test.
>>> fun(42, verbose=True)
Strength in numbers, eh? 42
>>> fun(['spam', 'spam', 'eggs', 'spam'], verbose=True)
Enumerate this:
0 spam
1 spam
2 eggs
3 spam
>>> fun(None)
Nothing.
>>> fun(1.23)
0.615

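this is how it is used. (The last two results, for None and 1.23, come from further registrations in the PEP that are not quoted above.) See the original document for the detailed explanation.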
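
For a session that runs as quoted, the registrations for None and float have to be present as well. The sketch below fills them in the way I recall them from PEP 443; treat the exact bodies of nothing and fun_num as an assumption, and note that the PEP also registers Decimal, which is left out here. It needs Python 3.4 or later.

```python
from functools import singledispatch


@singledispatch
def fun(arg, verbose=False):
    # Fallback used when no more specific type is registered.
    if verbose:
        print("Let me just say,", end=" ")
    print(arg)


@fun.register(int)
def _(arg, verbose=False):
    if verbose:
        print("Strength in numbers, eh?", end=" ")
    print(arg)


def nothing(arg, verbose=False):
    print("Nothing.")


# register() can also be called as a plain function.
fun.register(type(None), nothing)


@fun.register(float)
def fun_num(arg, verbose=False):
    if verbose:
        print("Half of your number:", end=" ")
    print(arg / 2)


fun("Hello, world.")
fun(42, verbose=True)
fun(None)
fun(1.23)
```

Dispatch looks only at the type of the first argument, so fun(1.23) reaches fun_num and prints half of the value, while the string falls through to the undecorated default.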

■_

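vallog: About a book called "ことばと思考" (Language and Thought)
Gradually spreading the idea of "stand side by side on escalators instead of leaving one side open" - 頭ん中
Running the tests - 多群最適化
Taking the Long View: Code Generation and Software Maintenance
The panelists discuss if code generation techniques help or hinder long-term software maintenance, and how such techniques can be integrated in the maintenance process.
A "Mathematics Experience Plaza" opens in Kagurazaka, Tokyo: experience mathematical theory first-hand with all five senses | マイナビニュース
Thinking about the problem of last-minute study-group cancellations from the organizer's side - @kmizuの日記
Ubuntu 13.10: the annoying text-input problem with IBus 1.5 - M59のブログ
Go and monads - M59のブログ
Hey, if you call that a monad, the Haskellers will show up, you know? (the Dazai method)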

■_

Ah, everything is left half-finished ○|￣|＿

October 21, 2013

■_

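Surely not timed to coincide with the typhoon (rest omitted). It's coming again this year! McDonald's "Gurakoro" goes on sale October 25 - ねとらぼ

What is this > Single Dispatch Functions, What's New In Python 3.4 | Hacker News. Ah, it is this: PEP 443 -- Single-dispatch generic functions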

■_ GNU APL

Found a fairly dark-arts technique in use. Id.cc


// a number of fixed UCS strings, one for each Id.
//
#define av(x, u) const UCS_string id_ ## x (UNI_ ## u);
#define pp(x, u) const UCS_string id_ ## x (UTF8_string(#x));
#define qf(x, u) const UCS_string id_QUAD_ ## x (UTF8_string("\xe2" "\x8e" "\x95" #x));
#define qv(x, u) const UCS_string id_QUAD_ ## x (UTF8_string("\xe2" "\x8e" "\x95" #x));
#define st(x, u) const UCS_string id_ ## x (UTF8_string(u));

#define id_def(_id, _uni, _val, _mac)   _mac(_id, _uni)
#include "Id.def"

//-----------------------------------------------------------------------------
ostream &
operator << (ostream & out, Id id)
{
   return out << id_name(id);
}
//-----------------------------------------------------------------------------

const UCS_string &
id_name(Id id)
{
   switch(id)
      {
#define av(x) case ID_ ## x: return id_ ## x;
#define pp(x) case ID_ ## x: return id_ ## x;
#define qf(x) case ID_QUAD_ ## x: return id_QUAD_ ## x;
#define qv(x) case ID_QUAD_ ## x: return id_QUAD_ ## x;
#define st(x) case ID_ ## x: return id_ ## x;

#define id_def(_id, _uni, _val, _mac) _mac(_id)
#include "Id.def"
      }

   CERR << "Unknown Id " << HEX(id);
   Assert(0 && "Bad Id");
}
//-----------------------------------------------------------------------------
Function *
get_system_function(Id id)
{
   switch(id)
      {
#define av(x) case ID_ ## x: return &Bif_F12_ROLL::fun;
#define pp(x)
#define qf(x) case ID_QUAD_ ## x: return &Quad_ ## x::fun;
#define qv(x)
#define st(x) 

#define id_def(_id, _uni, _val, _mac) _mac(_id)
#include "Id.def"
      }

   return 0;
}
//-----------------------------------------------------------------------------
Symbol *
get_system_variable(Id id)
{
   switch(id)
      {
#define av(x)
#define pp(x)
#define qf(x)
#define qv(x) case ID_QUAD_ ## x:return &Workspace::the_workspace->v_quad_ ## x;
#define st(x) 

#define id_def(_id, _uni, _val, _mac) _mac(_id)
#include "Id.def"
      }

   return 0;
}
//-----------------------------------------------------------------------------

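I have seen the trick of including the same file several times with the macros switched around elsewhere, but this id_def is quite something.

Looking at Id.def, it goes like this.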

//      Id              Unicode             = Value    Macro
//----------------------------------------------------------
id_def( No_ID         , ---               , = 0      , pp )
id_def( No_ID1        , ---               , = 1      , pp )
id_def( No_ID2        , ---               , = 2      , pp )
id_def( AF            , ---               , = 0x4101 , qf )
id_def( AI            , ---               ,          , qv )
id_def( ARG           , ---               ,          , qv )
id_def( F2_AND        , AND               ,          , av )
id_def( APL_VALUE     , ---               ,          , pp )
id_def( APL_VALUE1    , ---               ,          , pp )
id_def( APL_VALUE2    , ---               ,          , pp )
id_def( ASSIGN        , LEFT_ARROW        ,          , av )
id_def( AT            , ---               ,          , qf )
id_def( AV            , ---               ,          , qv )
id_def( F12_BINOM     , ASCII_EXCLAM      , = 0x4201 , av )
id_def( BRANCH        , RIGHT_ARROW       ,          , av )
id_def( CHARACTER     , ---               , = 0x4301 , pp )
id_def( F12_CIRCLE    , CIRCLE            ,          , av )
(omitted)
id_def( VARIABLE      , ---               , = 0x5601 , pp )
id_def( VOID          , ---               ,          , pp )
id_def( WA            , ---               , = 0x5701 , qv )
id_def( F12_WITHOUT   , TILDE_OPERATOR    ,          , av )
//-----------------------------------------------------------------------------

#undef id_def
#undef av
#undef pp
#undef qf
#undef qv
#undef st

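The Unicode column seems to hold the character names. I was curious about the last column, "Macro", and it turns out to be the _mac in #define id_def(_id, _uni, _val, _mac) _mac(_id, _uni): the name of a macro is passed in as an argument and then invoked. I did not know you could call macros that way.

Also: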

#define qf(x, u) const UCS_string id_QUAD_ ## x (UTF8_string("\xe2" "\x8e" "\x95" #x));
#define qv(x, u) const UCS_string id_QUAD_ ## x (UTF8_string("\xe2" "\x8e" "\x95" #x));

The \xe2 \x8e \x95 in these two definitions is the three-byte UTF-8 encoding (lead byte e2) of U+2395, the box-like APL quad character ⎕.
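
The byte sequence can be decoded to confirm which character it is. This is just a check using the Python standard library; the "AV" suffix stands in for what the #x stringification would append for an entry such as AV.

```python
import unicodedata

# The three bytes from the qf/qv macros, decoded as UTF-8.
quad = b"\xe2\x8e\x95".decode("utf-8")

print(len(quad), hex(ord(quad)))
print(unicodedata.name(quad))

# What the macro builds for the Id.def entry AV: quad followed by the name.
quad_av = quad + "AV"
print(quad_av, quad_av.encode("utf-8"))
```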

Id.hh has a short explanation.

/*!
 An Identifier for each internal object (primitives, Quad-symbols, and more).
 The ID can be derived in five ways:

 1. from the name of an ⎕AV element, e.g.  ID_F2_AND or ID_ASSIGN
 2. from a name,                     e.g.  ID_APL_VALUE or ID_CHARACTER
 3. from a distinguished var name,   e.g.  ID_QUAD_AI or ID_QUAD_AV
 4. from a distinguished fun name,   e.g.  ID_QUAD_AT or ID_QUAD_EM
 5. from a special token name              ID_L_PARENT1 or ID_R_PARENT1

  This is controlled by 5 corresponding macros: av() pp() qv() qf() resp. st()
 */

enum Id
{
#define av(x, v) ID_      ## x v,
#define pp(x, v) ID_      ## x v,
#define qf(x, v) ID_QUAD_ ## x v,
#define qv(x, v) ID_QUAD_ ## x v,
#define st(x, v) ID_      ## x v,

#define id_def(id, _uni, val, mac) mac(id, val)
#include "Id.def"
};

■_

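Naoyuki Kato's "Legend of the Galactic Ship Girls" - Togetter
The brain "takes out the trash" during sleep, a US group demonstrates - Technity
My ideal for Java development that assumes differences in skill exist - eller's blog
Akio Tsujimoto, "Electronic components": just as the customer asks. Asking key figures of the digital age about "practical marketing strategy", part 4: PRESIDENT Online - プレジデント
Technology/History/zip,gzip,zlib,bzip2 - Glamenv-Septzen.net
The origins of the notation for covariance and contravariance, and of "negative position" and "positive position" - Togetter
Secret society Metasepi strategy meeting no. 6, minutes - Metasepi
Gauche > Archives > 2013/10/21
The Game Boy's CPU [diary] - 魔法使いの森
But the Japanese Wikipedia is full of mistakes, I thought, so I looked at the English article.
Mostly out of distrust of the Japanese Wikipedia, I dug frantically into whether it is a Z80 or an 8080, but it turned out merely that what the Japanese Wikipedia said "was written without any basis".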

■_ str*

ruby-core

[ruby-core:57932] strlen and strnlen in Ruby

Hi there, Is there a reason why strcpy is used in some places but strncpy in others?
(I've been dogmatically following the advice of my elders to favour strncpy whenever possible since it's easy
to run into security issues or accidentally feeding a non-null-byte-terminated string in and having the program
crash or worse yet, use the result without checking.)

Edward

[ruby-core:57933] Re: strlen and strnlen in Ruby

On Friday, 18 October 2013 at 12:03 PM, Edward Ocampo-Gooding wrote:
> Is there a reason why strcpy is used in some places but strncpy in others?
>
>

I don't think there's any real convention in the CRuby codebase.

> (I've been dogmatically following the advice of my elders to favour strncpy whenever possible since it's easy to run into
> security issues or accidentally feeding a non-null-byte-terminated string in and having the program crash or worse yet,
> use the result without checking.)

Keep in mind strncpy has its own flaws - if the source string is exactly the length of the destination buffer,
it won't null terminate.

Personally I've never really been a fan of strncpy. strcpy is perfectly safe if you know what you're doing.
If you don't, you'll probably run into problems with strncpy too.

[ruby-core:57934] Re: strlen and strnlen in Ruby

Edward Ocampo-Gooding <edward / edwardog.net> wrote:
> (I've been dogmatically following the advice of my elders to favour
> strncpy whenever possible since it's easy to run into security issues
> or accidentally feeding a non-null-byte-terminated string in and
> having the program crash or worse yet, use the result without
> checking.)

strncpy is wrong in many cases used since it pads with trailing zeros.
AFAIK strncpy is a historical artifact from an ancient database format.

There's also strlcpy from OpenBSD.  strlcpy is safe as far as crashes
go, but silently truncating data leads to other problems.

So memcpy is preferable for correctness, and heavily-used in Ruby
already since the length of Ruby strings is known.

I haven't taken the time to audit the existing uses of str*cpy in Ruby,
but I suspect many are for convenience and non-critical paths..
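
The two points made in the replies (no terminator when the source fills the buffer, zero padding when it is shorter) can be modelled without any C. The following is a pure-Python model of the documented strncpy behaviour, not a call into libc; strncpy_model and the '?' filler for untouched bytes are my own inventions for illustration.

```python
def strncpy_model(dest_size, src, n):
    """Return the destination buffer after a C-style strncpy(dest, src, n).

    src is the string content without its terminating NUL; bytes of the
    destination that strncpy never writes are shown as '?'.
    """
    if n > dest_size:
        raise ValueError("n larger than the buffer would overflow it")
    dest = bytearray(b"?" * dest_size)
    for k in range(n):
        # Copy while source bytes remain, then pad with NULs up to n.
        dest[k] = src[k] if k < len(src) else 0
    return bytes(dest)


print(strncpy_model(4, b"abcd", 4))   # source fills the buffer
print(strncpy_model(8, b"ab", 8))     # shorter source
print(strncpy_model(8, b"ab", 4))     # n smaller than the buffer
```

In the first call the result contains no NUL byte at all, so reading it as a C string would run past the buffer; in the second, every remaining byte up to n is written as zero, which is the padding the third mail refers to.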

Ah, there they are.

addr2line.c:    strncpy(binary_filename, path, len);
ruby.c:         p = strncpy(RSTRING_PTR(buf), p, len);
ruby.c: strncpy(libpath, rubylib, sizeof(libpath));

addr2line.c:    strcpy(subdir, binary_filename);
addr2line.c:    strcpy(binary_filename, global_debug_dir);
dln.c:#define DLN_ERROR() (error = dln_strerror(), strcpy(ALLOCA_N(char, strlen(error) + 1), error))
dln.c:  strcpy(file, orig);
process.c:          new_argv[1] = strcpy(ALLOC_N(char, strlen(argv[0]) + 1), argv[0]);
util.c:    strcpy(buf, ".");