Tokidoki no Zakkichō″ (Occasional Notes), mid-March 2014

March 20, 2014

■_

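About this story: "Bug in Fuji Xerox's DocuWorks 8 could wipe every file on a drive" | Slashdot Japan IT; "[Urgent] Apology and request for action regarding a bug in DocuWorks 8 (version 8.0.3) that causes files to be lost when converting from PDF to DocuWorks under specific conditions" : Corporate Information : Fuji Xerox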

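A certain company has a similar product, and the order came down: "Investigate immediately whether we have the same kind of problem (bug), and report back." Fine, but how exactly is one supposed to "prove" and/or "guarantee" that no such fatal problem exists? There is no punchline.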

■_

This story came out right after that Apple thing, but it doesn't seem to have drawn nearly as much attention. GnuTLS vulnerability: is unit testing a matter of language culture? | Jan-Philip Gehrcke

GnuTLS vulnerability: is unit testing a matter of language culture? | Jan-Philip Gehrcke

Obviously, media and tech bloggers pointed out the significance of this issue. If you are interested in some
technical detail, I would like to recommend a well-written article on LWN on the topic: A longstanding GnuTLS
certificate validation botch. As it turns out, the bug was introduced by a code change that re-factored the
error/success communication between functions. Eventually, spoken generally, the problem is that two
communication partners went out of sync: when the sender sent ‘Careful, error!’, the recipient actually
understood ‘Cool, success.’. Bah. We are used to modern, test-driven development culture. Consequently, most
of us immediately think “WTF, don’t they test their code?”.

I still have several articles, including ones about goto fail, tossed into Pocket and left unread... ○|￣|＿

■_

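Crazy for life (Seikatsu Ichiban, IT Niban): God is in the details of systems, too
Corporate deep-dive: what happens to a company when its charismatic leader disappears? Organizations propped up by a single genius are this fragile. Seven & i's Toshifumi Suzuki, SoftBank's Masayoshi Son, Uniqlo's Tadashi Yanai, Suzuki's Osamu Suzuki | Keizai no Shikaku | Gendai Business [Kodansha]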
Managing your Software Debt
Software Laws 2014 (PDF) : programming
I also saw tweets on my Twitter timeline introducing a Japanese article with similar content.
Positive Bias and Testing : programming

March 19, 2014

■_ GNU grep

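I'd like to confirm the bottlenecks under a profiler (how severe they are, more than where they are), but setting up the environment is a hassle (and so on).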

That aside, here are the two functions that are probably responsible for the slowdown in multibyte searches.

/* searchutils.c - helper subroutines for grep's matchers.
   Copyright 1992, 1998, 2000, 2007, 2009-2014 Free Software Foundation, Inc.

(omitted)
char *
mbtolower (const char *beg, size_t *n, mb_len_map_t **len_map_p)
{
  static char *out;
  static mb_len_map_t *len_map;
  static size_t outalloc;
  size_t outlen, mb_cur_max;
  mbstate_t is, os;
  const char *end;
  char *p;
  mb_len_map_t *m;
  bool lengths_differ = false;

  if (*n > outalloc || outalloc == 0)
    {
      outalloc = MAX(1, *n);
      out = xrealloc (out, outalloc);
      len_map = xrealloc (len_map, outalloc);
    }

  /* appease clang-2.6 */
  assert (out);
  assert (len_map);
  if (*n == 0)
    return out;

  memset (&is, 0, sizeof (is));
  memset (&os, 0, sizeof (os));
  end = beg + *n;

  mb_cur_max = MB_CUR_MAX;
  p = out;
  m = len_map;
  outlen = 0;
  while (beg < end)
    {
      wchar_t wc;
      size_t mbclen = mbrtowc (&wc, beg, end - beg, &is);
#ifdef __CYGWIN__
      /* Handle a UTF-8 sequence for a character beyond the base plane.
         Cygwin's wchar_t is UTF-16, as in the underlying OS.  This
         results in surrogate pairs which need some extra attention.  */
(omitted)
#endif
      if (outlen + mb_cur_max >= outalloc)
        {
          size_t dm = m - len_map;
          out = x2nrealloc (out, &outalloc, 1);
          len_map = xrealloc (len_map, outalloc);
          p = out + outlen;
          m = len_map + dm;
        }

      if (mbclen == (size_t) -1 || mbclen == (size_t) -2 || mbclen == 0)
        {
          /* An invalid sequence, or a truncated multi-octet character.
             We treat it as a single-octet character.  */
          *m++ = 0;
          *p++ = *beg++;
          outlen++;
          memset (&is, 0, sizeof (is));
          memset (&os, 0, sizeof (os));
        }
      else
        {
          size_t ombclen;
          beg += mbclen;
#ifdef __CYGWIN__
          /* Handle Unicode characters beyond the base plane.  */
(omitted)
#endif
          ombclen = wcrtomb (p, towlower ((wint_t) wc), &os);
          *m = mbclen - ombclen;
          memset (m + 1, 0, ombclen - 1);
          m += ombclen;
          p += ombclen;
          outlen += ombclen;
          lengths_differ |= (mbclen != ombclen);
        }
    }

  *len_map_p = lengths_differ ? len_map : NULL;
  *n = p - out;
  *p = 0;
  return out;
}

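I cut the Cygwin-specific parts, which fiddle with surrogate pairs, since they aren't needed to follow the overall logic. This function is called in a multibyte locale when case is being ignored, and it returns a "lowercased" copy of the text being searched. Here too it grinds through wide-char conversion → lowercasing → narrow-char conversion for every character, so it obviously eats time. It also allocates a buffer to hold the lowercased copy, and that is probably why an ignore-case search ends up being done "one line at a time".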
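As a rough illustration of the round trip described above, here is a minimal Python sketch. It is not grep's code: `lower_with_len_map` is a hypothetical helper, it assumes the input is valid in the given encoding (mbtolower also copes with invalid sequences), and Python's `str.lower()` may expand one character into several, which C's `towlower` never does.

```python
def lower_with_len_map(data: bytes, encoding: str = "utf-8"):
    """Lowercase DATA one character at a time and record, per character,
    how many bytes shorter the lowercased form is. This is only a rough
    analogue of the len_map kept by mbtolower (hypothetical helper)."""
    out = bytearray()
    deltas = []
    for ch in data.decode(encoding):        # multibyte -> "wide" characters
        src = ch.encode(encoding)
        dst = ch.lower().encode(encoding)   # lowercase, then back to bytes
        out += dst
        deltas.append(len(src) - len(dst))
    return bytes(out), deltas


# U+212A KELVIN SIGN takes 3 bytes in UTF-8 but lowercases to ASCII "k",
# so the lowercased buffer is shorter than the original.
lowered, deltas = lower_with_len_map("A\u212a".encode("utf-8"))
print(lowered, deltas)
```

When any delta is non-zero, a match offset found in the lowercased buffer no longer equals the offset in the original text, which is presumably what `lengths_differ` and the length map are there to handle.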

And here is the other one. I think this one is also reached in case-sensitive searches, so I don't yet have a good grasp of how its cost shows up.

bool
is_mb_middle (const char **good, const char *buf, const char *end,
              size_t match_len)
{
  const char *p = *good;
  const char *prev = p;
  mbstate_t cur_state;

  /* TODO: can be optimized for UTF-8.  */
  memset(&cur_state, 0, sizeof(mbstate_t));
  while (p < buf)
    {
      size_t mbclen = mbrlen(p, end - p, &cur_state);

      /* Store the beginning of the previous complete multibyte character.  */
      if (mbclen != (size_t) -2)
        prev = p;

      if (mbclen == (size_t) -1 || mbclen == (size_t) -2 || mbclen == 0)
        {
          /* An invalid sequence, or a truncated multibyte character.
             We treat it as a single byte character.  */
          mbclen = 1;
          memset(&cur_state, 0, sizeof cur_state);
        }
      p += mbclen;
    }

  *good = prev;

  if (p > buf)
    return true;

  /* P == BUF here.  */
  return 0 < match_len && match_len < mbrlen (p, end - p, &cur_state);
}
#endif /* MBS_SUPPORT */

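This function checks, when a match is found, whether the match straddles a character boundary. Read in isolation it looks as if it scans from the start of the buffer, but the caller apparently uses newlines as markers so that the scan never goes back past the preceding one. I recall a comment (I forget where it was, lol) saying that in a multibyte locale with many matches, the boundary check would rescan from the start every time and be slow, so the input is processed line by line. Also, it hardly needs saying that this check becomes trivial when the encoding is known to be UTF-8. The part that Tanaka-san reworked around 2.17 contains code to that effect.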
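To show why the UTF-8 case is easy, here is a minimal Python sketch (not the code in grep). In UTF-8 every continuation byte has the bit pattern 10xxxxxx, so a boundary can be tested by looking at a single byte, with no scan from the start of the line. The function names are made up for illustration and the buffer is assumed to hold valid UTF-8.

```python
def is_char_start(buf: bytes, i: int) -> bool:
    """True if byte i starts a UTF-8 character, i.e. it is not a
    10xxxxxx continuation byte. Assumes BUF holds valid UTF-8."""
    return (buf[i] & 0xC0) != 0x80


def match_splits_character(buf: bytes, start: int, length: int) -> bool:
    """True if the byte range [start, start + length) begins or ends
    in the middle of a multibyte character."""
    end = start + length
    if not is_char_start(buf, start):
        return True
    return end < len(buf) and not is_char_start(buf, end)


text = "あい".encode("utf-8")                 # two 3-byte characters
print(match_splits_character(text, 0, 3))    # whole first character
print(match_splits_character(text, 1, 2))    # starts inside a character
```

Encodings such as Shift-JIS or EUC-JP have no such self-synchronizing property, which is why the general code has to walk forward with `mbrlen` from a known-good position.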

■_

With grep -P there is little slowdown - jarp,

Recalling that PCRE didn't support any multibyte encoding other than UTF-8, I took a quick look.

/* pcresearch.c - searching subroutines using PCRE for grep.
   Copyright 2000, 2007, 2009-2014 Free Software Foundation, Inc.

(omitted)

void
Pcompile (char const *pattern, size_t size)
{
#if !HAVE_LIBPCRE
  error (EXIT_TROUBLE, 0, "%s",
         _("support for the -P option is not compiled into "
           "this --disable-perl-regexp binary"));
#else
  int e;
  char const *ep;
  char *re = xnmalloc (4, size + 7);
  int flags = PCRE_MULTILINE | (match_icase ? PCRE_CASELESS : 0);
  char const *patlim = pattern + size;
  char *n = re;
  char const *p;
  char const *pnul;

# if defined HAVE_LANGINFO_CODESET
  if (STREQ (nl_langinfo (CODESET), "UTF-8"))
    {
      /* Enable PCRE's UTF-8 matching.  Note also the use of
         PCRE_NO_UTF8_CHECK when calling pcre_extra, below.   */
      flags |= PCRE_UTF8;
    }
# endif

  /* FIXME: Remove these restrictions.  */
  if (memchr (pattern, '\n', size))
    error (EXIT_TROUBLE, 0, _("the -P option only supports a single pattern"));

  *n = '\0';
  if (match_lines)
    strcpy (n, "^(");
  if (match_words)
    strcpy (n, "\\b(");
  n += strlen (n);

  /* The PCRE interface doesn't allow NUL bytes in the pattern, so
     replace each NUL byte in the pattern with the four characters
     "\000", removing a preceding backslash if there are an odd
     number of backslashes before the NUL.

     FIXME: This method does not work with some multibyte character
     encodings, notably Shift-JIS, where a multibyte character can end
     in a backslash byte.  */
  for (p = pattern; (pnul = memchr (p, '\0', patlim - p)); p = pnul + 1)
    {
      memcpy (n, p, pnul - p);
      n += pnul - p;
      for (p = pnul; pattern < p && p[-1] == '\\'; p--)
        continue;
      n -= (pnul - p) & 1;
      strcpy (n, "\\000");
      n += 4;
    }

  memcpy (n, p, patlim - p);
  n += patlim - p;
  *n = '\0';
  if (match_words)
    strcpy (n, ")\\b");
  if (match_lines)
    strcpy (n, ")$");

  cre = pcre_compile (re, flags, &ep, &e, pcre_maketables ());
  if (!cre)
    error (EXIT_TROUBLE, 0, "%s", ep);

(omitted)

Hmm, it doesn't seem to do any encoding conversion in particular. And there's a suspicious-looking comment at the FIXME.
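The FIXME can be made concrete with a small Python sketch. `escape_nuls` below is my own port of the NUL-replacement loop quoted above, written only for illustration (it is not part of grep or PCRE). The Shift-JIS example relies on the fact that "ソ" encodes as the two bytes 0x83 0x5C, the second of which is the backslash byte.

```python
def escape_nuls(pattern: bytes) -> bytes:
    """Replace each NUL byte with the four characters \\000, dropping a
    preceding backslash when an odd number of backslashes precede the NUL.
    Illustrative port of the loop in Pcompile."""
    out = bytearray()
    p = 0
    while True:
        pnul = pattern.find(b"\x00", p)
        if pnul < 0:
            break
        out += pattern[p:pnul]
        q = pnul
        while q > 0 and pattern[q - 1] == 0x5C:   # count backslashes
            q -= 1
        if (pnul - q) & 1:                        # odd: drop one backslash
            del out[-1]
        out += b"\\000"
        p = pnul + 1
    out += pattern[p:]
    return bytes(out)


sjis = "ソ".encode("shift_jis")           # second byte is 0x5C
print(sjis)
print(escape_nuls(sjis + b"\x00"))        # the trailing byte of "ソ" is lost
```

Because the loop works on bytes, it cannot tell a real backslash from the trailing byte of a two-byte character, so the character is truncated.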

■_

March 18, 2014

■_

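Nitpicking as usual. "Has the National Diet Library noticed?" contains the line "I don't mean to 'put a pole to the current' (流れに竿をさす) of the times, but...". Which sense is this usage meant in, I wonder. Has the meaning of "nagare ni sao sasu" changed?! - Gengorō: ramblings of a B-grade "man of leisure". Also, are 棹 and 竿 the same thing? (I'm not going to look it up.)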

There are all sorts of bots out there. Or rather, so this is the kind of data they collect.

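Confirmed 19:55, resumption expected 20:48. Tokyu Toyoko Line, Shin-Maruko Station: service suspended after an accident involving a person. Resumption statistics: mean 53 / median 57 / mode 10 / standard deviation 1σ ±31 / shortest 5 / longest 131 minutes. Shares: within mean ±10 min 27% / within mean ± SD 27% / below mean - SD 27% / mean + SD or more 19%. Sample size 79 (station)
— TRIES/Railway Recovery Forecast (@TRIES_rescuenow) March 18, 2014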

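[Tokyu Toyoko Line] Service is suspended on part of the line following an accident involving a person at Shin-Maruko Station at around 19:45. The resumption time estimated from statistics is 20:49 ± 27 minutes. http://t.co/cxIqSnWjkZ
— Rescuenow Crisis Management Information Center (@level4_r) March 18, 2014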

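Tokyu Toyoko Line: service resumed after the accident involving a person at Shin-Maruko Station (Rescuenow News) - Y! News http://t.co/pb6CkxE46c Suspension time: 100 minutes (19:45 to 21:25, confirmed). This exceeded the line's mean of 53 minutes plus the standard deviation of 31 minutes. The deviation score against the population of 79 cases was 65.2.
— TRIES/Railway Recovery Forecast (@TRIES_rescuenow) March 18, 2014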
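The 65.2 in the last tweet can be reproduced from the numbers the bot posted, assuming it uses the usual Japanese "deviation score" (hensachi) formula, 50 + 10 × (value - mean) / standard deviation. That the bot uses exactly this formula is my assumption; the function name is made up.

```python
def deviation_score(value: float, mean: float, stdev: float) -> float:
    """Standard score rescaled to mean 50 and standard deviation 10
    (assumed to be what the bot reports)."""
    return 50 + 10 * (value - mean) / stdev


# 100-minute suspension, line mean 53 minutes, standard deviation 31 minutes.
score = deviation_score(100, 53, 31)
print(round(score, 1))
```

The same numbers confirm the tweet's other claim: 53 + 31 = 84 minutes, which the 100-minute suspension exceeds.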

■_ Memory management in C programs

That reminds me, whatever became of that site esr set up? Memory management in C programs | Hacker News Memory management in C programs

Memory management in C programs


One large difference between C and most other programming languages is that in C, you have to handle memory
yourself rather than having a garbage collector do it for you. Ensuring that memory is allocated at the correct
moment is not very difficult (and something that needs to be done manually in pretty much every language); the
hard part is to ensure that enough memory is allocated, and to ensure that the memory is deallocated when it is
no longer in use.

There are several techniques available for memory management in C. Many of them are used in NetHack 3.4.3; and
even more are used somewhere in NetHack 4. In this blog post, I'd like to look at them and discuss their
advantages and disadvantages. I'm mostly concerned about correctness, rather than efficiency, here; that means
that unless the performance difference is very large, I care more about clean code than I do about fast code.

■_ 3.4.0

Python Insider: Python 3.4.0 released

Python Insider: Python 3.4.0 released

Python 3.4 includes a range of improvements of the 3.x series, including hundreds of small improvements and bug
fixes. Major new features and changes in the 3.4 release series include:

    PEP 428, a "pathlib" module providing object-oriented filesystem paths
    PEP 435, a standardized "enum" module
    PEP 436, a build enhancement that will help generate introspection information for builtins
    PEP 442, improved semantics for object finalization
    PEP 443, adding single-dispatch generic functions to the standard library
    PEP 445, a new C API for implementing custom memory allocators
    PEP 446, changing file descriptors to not be inherited by default in subprocesses
    PEP 450, a new "statistics" module
    PEP 451, standardizing module metadata for Python's module import system
    PEP 453, a bundled installer for the *pip* package manager
    PEP 454, a new "tracemalloc" module for tracing Python memory allocations
    PEP 456, a new hash algorithm for Python strings and binary data
    PEP 3154, a new and improved protocol for pickled objects
    PEP 3156, a new "asyncio" module, a new framework for asynchronous I/O

Hmm. I wonder what the new C API in PEP 445 looks like. The statistics module in PEP 450 also catches my interest.
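For a first feel of the PEP 450 module, here is a minimal sketch using three of its functions. The sample data is hypothetical, picked only so the results come out as round numbers.

```python
import statistics

# Hypothetical sample data (not taken from any source in this entry).
minutes = [2, 4, 4, 4, 5, 5, 7, 9]

mean = statistics.mean(minutes)        # arithmetic mean
median = statistics.median(minutes)    # middle value (average of the two middle ones here)
pstdev = statistics.pstdev(minutes)    # population standard deviation

print(mean, median, pstdev)
```

The module also offers `stdev` (sample standard deviation) and `mode`, among others.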

■_

I never see it in bookstores, so I went ahead and ordered it online: これならなっとく C言語へのいざない

March 17, 2014

■_

At (some) Toyoko Line stations the platform doors were finally in operation (apparently they actually went live on the 15th).

■_

Very roughly, the function-call flow of GNU grep looks like this:

compile (GEAcompile Gcompile Ecompile Acompile Fcompile Pcompile)
↓
grep_command_line_arg
 grepfile
 grepdesc
 grep
   filbuf
 grepbuf
 do_execute
 execute    (EGexecute Fexecute Pexecute)

■_

And today Tanaka-san posted yet another improvement: bug#17013: [PATCH] grep: optimization by using the Galil rule for Boyer-

bug#17013: [PATCH] grep: optimization by using the Galil rule for Boyer-

The Boyer-Moore algorithm runs in O(m n) in the worst case,
 which perhaps it may be much slower than the DFA.

The Galil rule enables to change O(m n) into O(n) for its case without
overheads and/or slow-down for other cases by avoiding to compare more
than once for a position in the text.  This patch implements it.

I prepare following string, which makes a worst case for Boyer-Moore
algorithm, to measure the performance.

    yes jjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjj | head -10000000 > ../k

I run the test with the patch (best-of-5 trials):

    env LC_ALL=C time -p src/grep kjjjjjjjjjjjjjjjjjjj k
        real 0.70       user 0.32       sys 0.38

Back out that commit (temporarily), recompile, and rerun the experiment:

    env LC_ALL=C time -p src/grep kjjjjjjjjjjjjjjjjjjj k
        real 3.97       user 3.56       sys 0.40

What was the Galil rule again? Time to google it.

Boyer-Moore string-search algorithm - Wikipedia (Japanese edition)

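The Galil rule

In 1979, Zvi Galil made a simple but important improvement to the Boyer-Moore method [4]. The added Galil rule does not
determine the shift amount; it speeds up the comparison done at each alignment. Suppose P is compared with T at alignment
k1, down to character c of T, and the next shift to alignment k2 puts the start of the pattern between c and k1. Then a
prefix of P is guaranteed to match the substring T[(k2 - n)..k1]. The character comparison therefore only needs to go
down to position k1 of T, and comparisons before k1 can be skipped. The Galil rule not only improves the efficiency of
Boyer-Moore, it is also required to guarantee linear time in the worst case.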

Hmm, I have no memory of it.

This page's explanation is detailed, in its other parts as well: Search Algorithms (4): String Searching -2-

Search Algorithms (4): String Searching -2-

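*1-1) When the Good Suffix Rule is applied, the characters from the end of the pattern back to the point where the
previous comparison mismatched are guaranteed to match. In particular, when the end of the pattern matches its beginning
(the second example shown; in the sample program, the shift obtained by the third step), the start of the pattern falls
between the previous mismatch point and the end, and everything up to the tail position where the previous comparison
began is guaranteed to match, so that range does not need to be compared. This optimization is called the "Galil rule".
The following example should make it easier to understand.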

Table 1-7. Example of the Galil Rule
...	X	Y	Z	X	Y	Z	...	← text
	Y	Z	Y	X	Y	Z		← pattern
...	X	Y	Z	X	Y	Z	...
	Y	Z	Y	X	Y	Z	← the leading "YZ" of the pattern is guaranteed to match
■_

I was made to waste two whole hours today, so I have (even) less energy than usual.

March 16, 2014

■_

In the afternoon, while out for a walk, I came across a jichinsai (ground-breaking ceremony). So someone is about to build a house.

Ahh, I want to eat some. Zhajiangmian (炸醬麵) - Wikipedia; Morioka jajamen - Wikipedia

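Ideon is on WOWOW, but I have no subscription so I can't watch it ○|￣|＿ "A TV first: Space Runaway Ideon, HD version, marathon broadcast!" | Anime | WOWOW Online. "All 39 episodes of the TV series, plus the theatrical films 'The Ideon: A Contact' and 'The Ideon: Be Invoked' (both 1982) airing in April, will be broadcast back to back." So they're showing A Contact and Be Invoked too.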

■_

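Topics | Super Presentation | E-Tele, NHK Online. I recorded Saturday's broadcast and only watched it today, and it was really good. The first half was this person's talk: "From systems engineer back to world number one: the yo-yo and 'comeback' of BLACK, the only Japanese performer at TED 2013 (1/3)" - ITmedia News; "A yo-yo world champion takes on TED, the conference the whole world is watching!"; BLACK: My journey to yo-yo mastery | Talk Video | TED. What you actually say is what matters, after all.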

Found his blog: yo-yo world champion BLACK's official blog "BLACK’s Yo-Yo life" Powered by Ameba

■_ LANG

1/35 Military Miniature Series No.88 German Jagdpanzer IV Lang 35088

A voice from the heavens reached me via Twitter, so here are a few words about the LANG environment variable.

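The page "String collation order (Collation)" says: "Besides specifying LC_COLLATE, collation is also set when the LC_ALL environment variable is specified. Note, however, that the LANG environment variable does not set collation." That is probably a misunderstanding, or an artifact of the environment it was tested in; as far as POSIX is concerned, this is how it works.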

Environment Variables

8.2 Internationalization Variables

This section describes environment variables that are relevant to the operation of internationalized interfaces
described in IEEE Std 1003.1-2001.

Users may use the following environment variables to announce specific localization requirements to applications.
Applications can retrieve this information using the setlocale() function to initialize the correct behavior of
the internationalized interfaces. The descriptions of the internationalization environment variables describe
the resulting behavior only when the application locale is initialized in this way. The use of the
internationalization variables by utilities described in the Shell and Utilities volume of IEEE Std 1003.1-2001
is described in the ENVIRONMENT VARIABLES section for those utilities in addition to the global effects
described in this section.

LANG
This variable shall determine the locale category for native language, local customs, and coded character
set in the absence of the LC_ALL and other LC_* ( LC_COLLATE , LC_CTYPE , LC_MESSAGES , LC_MONETARY ,
LC_NUMERIC , LC_TIME ) environment variables. This can be used by applications to determine the language to
use for error messages and instructions, collating sequences, date formats, and so on.

LC_ALL
This variable shall determine the values for all locale categories. The value of the LC_ALL environment
variable has precedence over any of the other environment variables starting with LC_ (LC_COLLATE,
LC_CTYPE, LC_MESSAGES, LC_MONETARY, LC_NUMERIC, LC_TIME) and the LANG environment variable.

LC_COLLATE
This variable shall determine the locale category for character collation. It determines collation
information for regular expressions and sorting, including equivalence classes and multi-character
collating elements, in various utilities and the strcoll() and strxfrm() functions. Additional semantics of
this variable, if any, are implementation-defined.

(omitted)

The values of locale categories shall be determined by a precedence order; the first condition met below
determines the value:

If the LC_ALL environment variable is defined and is not null, the value of LC_ALL shall be used.

If the LC_* environment variable (LC_COLLATE, LC_CTYPE, LC_MESSAGES, LC_MONETARY, LC_NUMERIC, LC_TIME)
is defined and is not null, the value of the environment variable shall be used to initialize the category
that corresponds to the environment variable.

If the LANG environment variable is defined and is not null, the value of the LANG environment variable shall be used.

If the LANG environment variable is not set or is set to the empty string, the implementation-defined
default locale shall be used.

If the locale value is "C" or "POSIX", the POSIX locale shall be used and the standard
utilities behave in accordance with the rules in POSIX Locale for the associated category.

If the locale value begins with a slash, it shall be interpreted as the pathname of a file that was created in
the output format used by the localedef utility; see OUTPUT FILES under localedef. Referencing such a pathname
shall result in that locale being used for the indicated category.

So, although its priority is low, LANG can affect collation too.
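The precedence order quoted above can be written down as a few lines of Python. This is only a sketch of the rule as the specification states it, not what any particular libc does internally; `effective_locale` is a made-up name, and the "C" fallback stands in for the implementation-defined default.

```python
def effective_locale(category: str, env: dict, default: str = "C") -> str:
    """Return the locale name that applies to CATEGORY (e.g. "LC_COLLATE")
    under the POSIX precedence order: LC_ALL, then the category's own
    variable, then LANG, then the implementation-defined default
    (assumed to be "C" here). Unset and empty values are skipped."""
    for name in ("LC_ALL", category, "LANG"):
        value = env.get(name)
        if value:
            return value
    return default


# LANG alone does determine collation when nothing else is set.
print(effective_locale("LC_COLLATE", {"LANG": "ja_JP.UTF-8"}))
# LC_COLLATE overrides LANG, and LC_ALL overrides both.
print(effective_locale("LC_COLLATE", {"LANG": "ja_JP.UTF-8", "LC_COLLATE": "C"}))
```

Passing `os.environ` as `env` shows what applies in the current shell.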

■_

Among the forthcoming titles I spotted Amazon.co.jp: The CERT® C Coding Standard, Second Edition: 98 Rules for Developing Safe, Reliable, and Secure Systems (2nd Edition) (SEI Series in Software Engineering): Robert C. Seacord: Foreign-language books, and thought "wasn't that already out?" On checking, it seems I had confused it with Secure Coding in C and C++ (2nd Edition) (SEI Series in Software Engineering): Robert C. Seacord: 9780321822130: Amazon.com: Books.

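I also came across Amazon.co.jp: 初めてのC++プログラミング 初学者にささげる問題集＆解答解説集 - ～アルゴリズム編～（2014年度版） (MyISBN - デザインエッグ社): 中山功一: Books

Amazon.co.jp: 初めてのC++プログラミング 初学者にささげる問題集＆解答解説集 - ～アルゴリズム編～（2014年度版） ("First C++ Programming: problems and worked solutions dedicated to beginners, Algorithms volume, 2014 edition") (MyISBN - Design Egg): Koichi Nakayama: Books

Product description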

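Contains 50 problems with their answers and careful explanations, for beginners within their first year of studying C/C++.
The only knowledge required is four things: variable definitions, conditional branching, loops, and input/output, all of
which are also explained in this book. Many books demand more knowledge than this. When a programming beginner reads
such a book, they get stuck on studying "knowledge of the programming language". This book is a problem collection that
aims to build "the ability to write programs" while using as little "knowledge of a specific programming language" as
possible.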

Hmm...

■_

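A story about making mruby smaller and bigger - Spectrum
An introduction to M4
What?! > "It is also standardized in POSIX."
Well, no. I suppose that's only natural.
Twitter / emaxser: @kazh98 ねー、ねー、Gaucheをgithubからク ... (truncated tweet)
Security Fundamentals: (01) Understanding Security Layers | Security Fundamentals | Channel 9
I gave a talk at JAWS DAYS 2014 titled "Rethinking 'technical debt'" #jawsdays - delirious thoughts
hirax.net:: "Source code" printed on the back of the paper that covered a book more than 30 years old
(3/3) Weekend Special - What are "digital earplugs"? We actually tried them!: ITpro
"However, if it is other people's conversations that bother you and keep you from concentrating on your work, they cannot be called very effective."
No way ○|￣|＿
The consumption tax hike will hit game arcades head-on: Nikkei Business Online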

■_

bug-grep (date)

Still some activity there.

March 15, 2014

■_

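Spotted the fifth edition of "Hennepata": the Japanese translation of Hennessy & Patterson, Computer Architecture: A Quantitative Approach, 5th edition.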

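I'm also wondering what to do about this one: the Japanese translation of Introduction to Algorithms, 3rd edition, complete volume (世界標準MIT教科書 series)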

This one I saw today also has me curi(ous): ティムール帝国 (The Timurid Empire) (Kodansha Sensho Métier)

■_ Questions for Theo-sama

Something I learned about a little while ago via HN (probably). Interview: Ask Theo de Raadt What You Will - Slashdot

Interview: Ask Theo de Raadt What You Will - Slashdot

Theo de Raadt was a founding member of NetBSD, and is the founder and leader of the OpenSSH and OpenBSD projects.
He is currently working on OpenBSD 5.5 which would be the projects 35th release on CDROM. Even though he'd
rather be hiking in the mountains or climbing rocks in his free time, Theo has agreed to answer any question you
may have. As usual, ask as many as you'd like, but please, one question per post.

That's quite a lot of material.

■_

That thing everyone has been talking about here and there. Wolfram Language Demo - Business Insider

Wolfram Language Demo - Business Insider

Controversial mathematician Stephen Wolfram is about to release a programming language with the goal of being
able to quickly do just about any calculation or visualization on just about any kind of data a person could
want.

Wolfram, creator of the widely used mathematical software Mathematica and the "computational knowledge
engine" Wolfram|Alpha, has announced the forthcoming release of the Wolfram Language, the underlying
programming language powering those two pieces of software.

Come to think of it, I recently saw something about Mathematica's language on InfoQ too. Oh, here it is: The Secret Life of a Mathematica Expression; The Secret Life of a Mathematica Expression at QCon San Francisco 2013 | Lanyrd

It was up on Slideshare as well. The Secret Life of a Mathematica Expression

March 14, 2014

■_

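I checked the Open University of Japan programs starting in April. A few of them catch my interest, but I don't have time to watch them all. TV program schedule | broadcast lecture period; Radio program schedule | broadcast lecture period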

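Sticky notes came up the other day; this time it's highlighters. I'm sorry that this comes from a paid e-mail magazine, but from the issue published on 3/1 (押井守の「世界の半分を怒らせる」。 issue 34: Mamoru Oshii (and unpleasant company) - Niconico Channel: Society/Opinion) I learned that retractable (click-type) highlighters exist. According to Oshii-san: "It's the MUJI retractable marker. The color has to be yellow." So he says. Polypropylene retractable highlighter, yellow | MUJI Net Store. Looking into it, quite a few makers turn out to sell them. I had never noticed.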

Pentel Handy-line S retractable highlighter, 5-color set SXNS15-5

■_

News - "Opening up the automotive industry": Toyota and BMW speak at Automotive Linux Summit: ITpro

Energy technology: inspired by a typhoon, the new solar cell that powers "Oceanus" (1/2) - EE Times Japan

■_ 軟件考古学者 (software archaeologist)

I like it. 軟件考古学者, "software archaeologist". I saw it in Programming Exercises.

Ah, it was on Speaker Deck, lol: Programming Exercises - Terry Yin - Agile SG 2013 // Speaker Deck

March 13, 2014

■_

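I came across "The sticky note I recommend for reading is the 'Post-it Transparent Slim Index'!" - Matryoshka-teki Nichijō. My own favorite is this one. It works out a little pricier, but because it can be slipped into a book like a bookmark, it is invaluable for someone like me who often carries a book around and reads in spare moments. Digging sticky notes out of a bag every time is a pain. There are also products made to be stuck onto a book's endpaper, such as the Kanmido Coco Fusen CF-4002 (グレークロ case), and the Kanmido Coco Fusen Card Color S CF-5002, which fits in a credit-card-sized case, but with my favorite you don't have to "stick" anything onto the book (or peel it off), which helps. Hm, there's something I can't find, Amazon-san, lol. Midori | Product Catalog | Sticky notes / index labels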

■_

When I found this one, which is also on InfoQ (over there you can't download it without registering and logging in), I noticed that several others also have "software craftsmanship" in the title: Software Craftsmanship: Paying Attention to Quality by Ken Auer // Speaker Deck; What happens when Software Craftsmanship meets User Experience? // Speaker Deck; CAST 2013: Software Craftsmanship for Testers // Speaker Deck

■_

PEP 463: Exception-catching expressions has been rejected : Python; [Python-Dev] Requesting pronouncement on PEP 463: Exception-catching expressions. Seeing that it was rejected, I got curious and went to have a look: PEP 463 -- Exception-catching expressions

PEP 463 -- Exception-catching expressions

# LBYL:
if key in dic:
    process(dic[key])
else:
    process(None)
# As an expression:
process(dic[key] if key in dic else None)

# EAFP:
try:
    process(dic[key])
except KeyError:
    process(None)
# As an expression:
process(dic[key] except KeyError: None)

Hmm.
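Since the `except` expression was rejected, the last line of the excerpt is not valid Python. A minimal sketch of what can be written today instead: `dict.get` covers this particular mapping case, and a small helper covers the general one. `catch` and the stand-in `process` are hypothetical names of mine, not part of the PEP or the standard library.

```python
def catch(func, exceptions, default=None):
    """Call FUNC and return its result, or DEFAULT if it raises one of
    EXCEPTIONS. A hypothetical helper, not a standard-library function."""
    try:
        return func()
    except exceptions:
        return default


def process(value):
    """Stand-in for the PEP's process(); simply echoes its argument."""
    return value


dic = {"a": 1}

print(process(dic.get("b")))                       # mapping-specific shortcut
print(process(catch(lambda: dic["b"], KeyError)))  # general EAFP as an expression
print(process(catch(lambda: dic["a"], KeyError)))
```

The lambda is needed to delay evaluation, which is part of the awkwardness the PEP was trying to remove.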

■_

A second edition, it seems. The Pragmatic Bookshelf | Metaprogramming Ruby, Second Edition now in beta

The Pragmatic Bookshelf | Metaprogramming Ruby, Second Edition now in beta

This completely revised new edition covers the new features in Ruby 2.0 and 2.1, and contains code from the
latest Ruby libraries, including Rails 4. Most examples are new, "from the wild," with more recent
libraries. And the book reflects current ideas of when and how much metaprogramming you should use.

Whether you're a Ruby apprentice on the path to mastering the language or a Ruby wiz in search of new tips, this
book is for you.

If you own the first edition: you'll find an upgrade coupon for 35% in your account. Enjoy!

Maybe I'll go ahead and buy the beta.

■_

March 12, 2014

■_

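A headline like this: "Britain's oldest nuclear submarine to have its reactor replaced" « WIRED.jp. "The British nuclear submarine Vanguard, the oldest British nuclear submarine in active service and one long said to have structural defects, is to have its old reactor replaced with a new one." Is it really all right to drop "in active service" from the headline?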

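This was interesting. I had indeed been thinking that removing the bicycles themselves has little effect, since it's only done once in a while. A presentation on countermeasures against illegal bicycle parking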

■_

That programming language you love... Your Favorite Programming Language Sucks | Py Skool

Your Favorite Programming Language Sucks | Py Skool

3. Java Hi there! I want someone who will write overly complicated and verbose code. I also need my JVM to
crash every other day with the latest security hole. As for GUI, we will settle for a piece of crap that looks
the same on every OS. You say you can help? Great!

(omitted)

6. Lisp, Haskell and other functional languages Of all the languages, none are more contemptible than the so
called mathematically elegant functional languages. At least the other languages solve a real problem. The
Lisp/Haskell crowd lives in a dream world where they wear a monocle, drink fine wine and write elegant code.
They always show the same toy examples. “If you are writing this convoluted made up example, Haskell is better
than C.” Sure it is, grandpa.

No, reading SICP or the elephant book did not improve my programming. Reading 50 shades of grey, or any of the
vampire romance novels will teach you more about programming than learning Scheme or Haskell will.

7. Delphi/Pascal *Snort*

(omitted)

13. Any language not mentioned here Your lanaguge sucks so much, I can’t even be bothered writing about it.

14. LolCode and Brainfuck Bravo! Finally, someone who knows how to write the codez. Pat yourself on the back.

If I have forgotten any language, feel free to insult it in the comments.


· © 2014 Py Skool · Designed by Themes & Co ·

The "elephant book" that comes up in the Lisp/Haskell item... oh, it's this one!

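■_ Did everything begin with F?

While looking something up I found "CiNii Articles - A proposal of a final-quality prediction model for the software development process based on Bayesian networks", which says: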

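 Against this background, many methods for predicting the number of residual defects have been proposed [11]. For example,
software reliability growth models (SRGM) predict the number of remaining defects from the trend in the number of defects
found during the testing phase. However, this approach covers only a single testing phase, and does not take the design
phase or the quality of the product into account.
 Meanwhile, among methods that use metrics obtained from the product, many are based on statistical analysis. Methods
using regression analysis in particular have been tried often, and [3] and [9] are well known. However, it has been
pointed out that such regression-based methods have problems [5]. The most characteristic problem is that the
independence of the dependent variables must be guaranteed. Real software metrics are often highly correlated with one
another, and it is stated that this makes it difficult to build a regression model out of metrics that can be used
together.
 In [5], Fenton et al. propose modeling with Bayesian networks as a way to overcome the problems that arise with
regression-based methods. A Bayesian network is one technique for modeling subjects with complex dependencies, and one of
its features is that prediction remains possible even when the values of elements in the model are uncertain.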

That [5] is this one: CiteSeerX — A Critique of Software Defect Prediction Models

Most defect prediction studies are based on size and complexity metrics. The earliest such study appears to
have been Akiyama’s, [5], which was based on a system developed at Fujitsu, Japan. It is typical of many
regression based “data fitting” models which became common place in the literature.  The study showed that
linear models of some simple metrics provide reasonable estimates for the total number of defects D (the
dependent variable) which is actually defined as the sum of the defects found during testing and the defects
found during two months after release. Akiyama computed four regression equations.

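The words Gompertz and Logistic don't appear, but what it does looks along those lines? The reference, which once again happens to be numbered [5], is [5] F. Akiyama, “An Example of Software System Debugging,” Information Processing, vol. 71, pp. 353-379, 1971. But... could this possibly be 情報処理 (Jōhō Shori)? The year 1971 also seems a bit doubtful.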

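University of Tokyo, Faculty of Science, Department of Information Science / Graduate School of Information Science and Technology | Department of Information Science NAVIgation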