ときどきの雑記帖 RE* (新南口)

Death Speakers

September 6, 2023

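My DAP broke

It wasn't an expensive model and I'd been using it for years, so fair enough... but it did break in an interesting (?) way.

The sound started to come through as if the headphone cable were broken, so I assumed it was the cable, but swapping in new headphones didn't change the symptoms.

Wiggling the headphone connector sometimes brings the sound back, but moving the unit even slightly makes it go wrong again.

If I took it apart and traced the internals I might find out what's going on, but I think I'll just retire it.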

Shoebill

The animal in the Nisshinbo commercial shown (on the monitors) in Nambu Line trains is now a shoebill. The ones so far were

Malayan sun bear
horse
otter
quokka

I think.

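Pikmin

The Nintendo spot that also runs in Nambu Line trains switched, starting with this week's edition, to the same Pikmin one as on the Yamanote Line. I wonder why the timing was offset? 🤔

Convenience-store steamed buns

They started appearing around last week.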

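Teeth of a comb

While I was watching a sports broadcast on TV, the announcer said 「櫛の歯が抜けたように…」 ("like the teeth of a comb falling out...").

What kind of comb is that? (lol)

I knew some people get this idiom wrong (the standard form is 「櫛の歯が欠けたように」, "like a comb with missing teeth"), but an announcer shouldn't.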

士農工商

■ - NullPointer’s

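Reading the above reminded me that when we were young, people used to say 「士農工商犬プログラマ」 (samurai, farmers, artisans, merchants, dogs, programmers). "(IT) engineer" and "programmer" are not the same thing? Fair point.

"I heard it's 士農工商犬プログラマー; is that true?" - そんな… - Yahoo!知恵袋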

The IT industry has long had a caste system that goes

士農工商　犬　プログラマ

you know.
After a while, it was extended to

士農工商　犬　プログラマ　SE

you know. www https://t.co/S8z4xe1lXU
— t157 (@t157) March 14, 2021

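To digress a little: personally I think only people who have received professional training in engineering should call themselves "engineers", and that even someone with a CS degree should not call themselves an "(IT) engineer" on that basis alone (so I have never described myself as an "(IT) engineer", or at least I shouldn't have).

Of course, that raises the question of what counts as "professional training", but let's leave that aside.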

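New and upcoming books

Semicolon

This book looks interesting:

Book 『セミコロンかくも控えめであまりにもやっかいな句読点』 on sale August 30 | Reading | Recommended English conversation and English-learning comparisons and rankings - English Hub

Ōta Dōkan

A novel with Ōta Dōkan as the protagonist, apparently.

With Ōta Dōkan, who raced through the great upheavals of the Kantō region, as the protagonist, discerning the beginning of the Sengoku era | tree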

コードが動かないので帰れません！

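Oh, the advance sale has started.

[5F PC books] Advance release from 翔泳社!
『コードが動かないので帰れません！　新人プログラマーのためのエラーが怖くなくなる本』
has arrived ahead of the general release.
Learn how to read errors and how to debug!!
You can find it on shelf E10 and on the new-releases shelf.
Come and have an early look♪es pic.twitter.com/ykG0kt17be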
— 紀伊國屋書店新宿本店 (@KinoShinjuku) September 5, 2023

9/5 advance sale: ISBN978-4-7981-8067-0 翔泳社『コードが動かないので帰れません！新人プログラマーのためのエラーが怖くなくなる本』 by 桜庭洋之 and 望月幸太郎, 40 copies in stock pic.twitter.com/pn3vUitHvt
— ジュンク堂書店池袋本店 PC書担当 (@junkudo_ike_pc) September 5, 2023

Rules of programming

ルールズ・オブ・プログラミング
Indeed, this really is a passage where you can't tell whether the original English is odd or the translation is. pic.twitter.com/6Thsb3qGAI
— natsutan (@natsutan) September 6, 2023

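Is this a sign that I should buy the original as well?

A review on Amazon is worried about the translation too:

I thought the content was good, but the translator's afterword is written badly enough that I couldn't ignore it, and it made me worry about the translation. I think I'll read the original, The Rules of Programming: How to Write Better Code.

So there's that.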

August GNU Spotlight

The August edition. Quite a few big, or rather major, packages this month?

August GNU Spotlight with Amin Bandali: Seventeen new GNU releases!

binutils-2.41
coreutils-9.4
emacs-29.1
gama-2.25
glibc-2.38
gmp-6.3.0
gnucobol-3.2
gnutls-3.8.1
gzip-1.13
less-643
lilypond-2.24.2
linux-libre-6.5-gnu
mpfr-4.2.1
octave-8.3.0
parallel-20230822
poke-3.3
screen-4.9.1

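Seeing gnucobol made me wonder whether another COBOL compiler project hadn't been started as well, so I looked it up:

COBOL compiler "gcobol" announced | OSDN Magazine

Details and the repository for gcobol can be accessed from the project's website.

Hmm, following that link now asks me to log in (sign in). It wasn't like that when the article was published.

Searching the gcc mailing list might turn up a related post, but that's too much of a hass(ry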

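松竹梅 (pine, bamboo, plum)

In the end:

With C.UTF-8 it doesn't come out as 松竹梅, but with ja_JP.utf8 (Ubuntu) it does, so is it a locale issue? / Section 4.4.11 of JIS X 4061 says the kanji are ordered by JIS X 0208 kuten number, so canadie seems to be right

日本語文字列照合順番 Collation of japanese character string

4.4.11 Kanji
(2) Basic kanji character class: the kanji of this class consist of the 5 characters of the minimum kanji character class in that order, followed by the 6355 kanji defined in JIS X 0208 in order of their kuten numbers.

(3) Extended kanji character class: the kanji of this class consist of the 5 characters of the minimum kanji character class in that order, followed by the 20901 kanji defined as "CJK Unified Ideographs" in JIS X 0221, excluding "4EDD 仝", in order of their code positions.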
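As a rough check of what "ordered by kuten number" would mean for these three characters, here is a minimal Python sketch. It assumes that Python's `euc_jp` codec encodes each JIS X 0208 kanji as two bytes holding its row and cell (each offset by 0xA0), so that comparing those pairs compares kuten numbers; the helper name `kuten_key` is made up for illustration. It only models the rule quoted above, not what the locale data on any particular system actually does.

```python
def kuten_key(ch: str) -> tuple[int, int]:
    """Return the (row, cell) kuten pair of a JIS X 0208 character."""
    encoded = ch.encode("euc_jp")
    if len(encoded) != 2:
        raise ValueError(f"{ch!r} is not a two-byte JIS X 0208 character")
    # EUC-JP stores the row and the cell each offset by 0xA0.
    return encoded[0] - 0xA0, encoded[1] - 0xA0


chars = list("竹梅松")
by_code_point = sorted(chars)            # plain Unicode code point order
by_kuten = sorted(chars, key=kuten_key)  # JIS X 0208 kuten order

print("code point:", "".join(by_code_point))
print("kuten:     ", "".join(by_kuten))
```

Sorting by code point gives 松梅竹, while sorting by kuten number gives 松竹梅, which is consistent with the comment that a JIS X 4061 style collation puts the three in the familiar order.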

That seems to be the rule, but when I try it on my Ubuntu machine:

kbk@toybox4:~$ echo 松竹梅
松竹梅
kbk@toybox4:~$ echo 松竹梅 | grep -o . | sort
松
梅
竹
kbk@toybox4:~$ echo $LC_COLLATE

kbk@toybox4:~$ echo $LC_ALL

kbk@toybox4:~$ echo $LANG
C.UTF-8
kbk@toybox4:~$ LC_COLLATE=ja_JP.UTF-8 echo 松竹梅 | grep -o . | sort
松
梅
竹

Huh?
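One thing worth checking in the transcript above: in a POSIX shell, a `VAR=value` prefix applies only to the single command it is attached to, so `LC_COLLATE=ja_JP.UTF-8 echo ... | grep ... | sort` hands the variable to `echo`, and `sort` still runs with the session's C.UTF-8 setting. The sketch below shows that scoping with a neutral variable; `GR_DEMO_VAR` and the helper `run` are made-up names, and it assumes `sh` is available. Whether moving the assignment onto `sort` then yields 松竹梅 still depends on the ja_JP.UTF-8 locale being installed, which this sketch does not test.

```python
import subprocess


def run(script: str) -> str:
    """Run a POSIX shell snippet and return its stdout without surrounding whitespace."""
    result = subprocess.run(
        ["sh", "-c", script], capture_output=True, text=True, check=True
    )
    return result.stdout.strip()


# Assignment prefixed to the first command: only `echo` sees it.
prefix_on_first = run(
    "unset GR_DEMO_VAR; "
    "GR_DEMO_VAR=set echo x | sh -c 'echo ${GR_DEMO_VAR:-unset}'"
)

# Assignment prefixed to the last command: that command sees it.
prefix_on_last = run(
    "unset GR_DEMO_VAR; "
    "echo x | GR_DEMO_VAR=set sh -c 'echo ${GR_DEMO_VAR:-unset}'"
)

print(prefix_on_first)
print(prefix_on_last)
```

By the same reasoning, the form to try would be `echo 松竹梅 | grep -o . | LC_COLLATE=ja_JP.UTF-8 sort`, with the assignment on `sort` itself.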

kbk@toybox4:~$ sort --version
sort (GNU coreutils) 8.30
Copyright (C) 2018 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <https://gnu.org/licenses/gpl.html>.
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.

Written by Mike Haertel and Paul Eggert.

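Looking more closely, Ubuntu itself is 20.04 (I thought I had moved to 22.04), so could that be the cause?

But has there been any recent change or fix that would affect this?

Incidentally, here is how the original article splits the string into single characters: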

kbk@toybox4:~$ echo -n "竹梅松" | sed -ne 's/./\0\n/gp' | sort
松
梅
竹
kbk@toybox4:~$ LC_COLLATE=ja_JP.UTF-8 echo -n "竹梅松" | sed -ne 's/./\0\n/gp' | sort
松
梅
竹
kbk@toybox4:~$ LC_ALL=ja_JP.UTF-8 echo -n "竹梅松" | sed -ne 's/./\0\n/gp' | sort
松
梅
竹
kbk@toybox4:~$ LANG=ja_JP.UTF-8 echo -n "竹梅松" | sed -ne 's/./\0\n/gp' | sort
松
梅
竹

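The original article uses sed like this, but the \0 in the replacement part of the s command puzzles me. As far as I know, only 1 through 9 can be written there, and the whole regular-expression match is supposed to be written as &. Is this a GNU extension or something?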

kbk@toybox4:~$ echo -n "竹梅松" | sed -ne 's/./\0\n/gp'
竹
梅
松
kbk@toybox4:~$ echo -n "竹梅松" | sed -ne 's/./&\n/gp'
竹
梅
松

sed, a stream editor

3.3 The s Command

The replacement can contain \n (n being a number from 1 to 9, inclusive) references, which refer to the portion of the match which is contained between the nth ( and its matching ). Also, the replacement can contain unescaped & characters which reference the whole matched portion of the pattern space.

Right, 1 to 9. So I looked at the source code, sed/execute.c and sed/compile.c in sed.git (GNU stream editor), and found:

sed/execute.c

static void append_replacement (struct line *buf, struct replacement *p,
                                struct re_registers *regs)
{

      int i = p->subst_id;
      enum replacement_types curr_type;

      /* Apply a \[lu] modifier that was given earlier, but which we
         have not had yet the occasion to apply.  But don't do it
         if this replacement has a modifier of its own. */
      curr_type = (p->repl_type & REPL_MODIFIERS)
        ? p->repl_type
        : p->repl_type | repl_mod;

      repl_mod = 0;
      if (p->prefix_length)
        {
          str_append_modified (buf, p->prefix, p->prefix_length,
                               curr_type);
          curr_type &= ~REPL_MODIFIERS;
        }

      if (0 <= i && i < regs->num_regs)
        {
          if (regs->end[i] == regs->start[i] && p->repl_type & REPL_MODIFIERS)
            /* Save this modifier, we shall apply it later.
               e.g. in s/()([a-z])/\u\1\2/
               the \u modifier is applied to \2, not \1 */
            repl_mod = curr_type & REPL_MODIFIERS;

          else if (regs->end[i] != regs->start[i])
            str_append_modified (buf, line.active + regs->start[i],
                                 regs->end[i] - regs->start[i],
                                 curr_type);
        }
    }
}

sed/compile.c

            switch (*p)
              {
              case '0': case '1': case '2': case '3': case '4':
              case '5': case '6': case '7': case '8': case '9':
                tail->subst_id = *p - '0';

// (a large chunk omitted here)

      else if (*p == '&')
        {
          /* Preceding the ampersand may be some literal text: */
          tail = tail->next =
            new_replacement (base, (size_t)(p - base), repl_type);

          repl_type = save_type;
          tail->subst_id = 0;

I see: with this code \0 ends up behaving the same as &, but I wonder whether that is intentional.
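To make that reading concrete, here is a toy Python model of the two excerpts. It is not sed's code, only a sketch of the idea: a backslash-digit stores the digit as the group id, `&` stores group id 0, and expansion simply indexes the match registers, where index 0 is the whole match. The function names `compile_replacement` and `substitute_all` are made up for illustration, and Python's `re` stands in for sed's regex engine.

```python
import re


def compile_replacement(template):
    """Split a replacement template into (literal_prefix, group_id) pieces."""
    pieces = []
    literal = []
    i = 0
    while i < len(template):
        ch = template[i]
        if ch == "\\" and i + 1 < len(template):
            nxt = template[i + 1]
            if nxt in "0123456789":
                # \0 .. \9: the digit itself becomes the group id.
                pieces.append(("".join(literal), int(nxt)))
                literal = []
            elif nxt == "n":
                literal.append("\n")
            else:
                literal.append(nxt)
            i += 2
        elif ch == "&":
            # & is stored as a reference to group 0, the whole match.
            pieces.append(("".join(literal), 0))
            literal = []
            i += 1
        else:
            literal.append(ch)
            i += 1
    pieces.append(("".join(literal), None))  # trailing literal text
    return pieces


def substitute_all(pattern, template, text):
    """Replace every match of pattern in text using the toy template rules."""
    pieces = compile_replacement(template)

    def expand(match):
        out = []
        for prefix, group_id in pieces:
            out.append(prefix)
            # Only expand ids that exist as registers for this pattern.
            if group_id is not None and group_id <= match.re.groups:
                out.append(match.group(group_id) or "")
        return "".join(out)

    return re.sub(pattern, expand, text)


print(substitute_all(".", r"\0\n", "竹梅松"), end="")
print(substitute_all(".", r"&\n", "竹梅松"), end="")
```

In this model both templates compile to the same pieces, so the two calls print the same three lines, which mirrors the two sed runs shown earlier.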

Ada

Looking at the sample code:

package Ada_Foo_Pack is

   function Ada_Foo return Integer;

private
   pragma Export
      (Convention    => C,
       Entity        => Ada_Foo,
       External_Name => "ada_foo");
end Ada_Foo_Pack;

package body Ada_Foo_Pack is

   Ultimate_Unswer : Integer := 0;

   function Ada_Foo return Integer is
   begin

      return Ultimate_Unswer;

   end Ada_Foo;

begin

   --  This line of code will run during elaboration
   --
   Ultimate_Unswer := 42;

end Ada_Foo_Pack;

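I see, so this is how a routine is made callable from C: pragma Export gives the Ada function the C calling convention (in other words, specifies the ABI) and the external name ada_foo. It may be a "dialect", though.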

FORTRAN Compiler on IBM 704

order

And then, the moment I tried to add a little more behavior, I got stuck.

On upper and lower case in regex-like patterns such as [a-z] in Bash https://t.co/57aMfa0rZV
— はけた＠できるExcel2021 (@excelspeedup) September 3, 2023

The referenced article, 「Bashでの[a-z]みたいな正規表現での大文字小文字について」, doesn't resolve it either (it settles for a workaround), but this is a locale issue too, isn't it?

Bash Reference Manual

case

The syntax of the case command is:
   case word in
       [ [(] pattern [| pattern]…) command-list ;;]…
   esac
case will selectively execute the command-list corresponding to the first pattern that matches word. The match is performed according to the rules described below in Pattern Matching. If the nocasematch shell option (see the description of shopt in The Shopt Builtin) is enabled, the match is performed without regard to the case of alphabetic characters. The ‘|’ is used to separate multiple patterns, and the ‘)’ operator terminates a pattern list. A list of patterns and an associated command-list is known as a clause.

Bash Reference Manual 3.5.8.1 Pattern Matching

[…]

Matches any one of the enclosed characters. A pair of characters separated by a hyphen denotes a range expression; any character that falls between those two characters, inclusive, using the current locale’s collating sequence and character set, is matched. If the first character following the ‘[’ is a ‘!’ or a ‘^’ then any character not enclosed is matched. A ‘-’ may be matched by including it as the first or last character in the set. A ‘]’ may be matched by including it as the first character in the set. The sorting order of characters in range expressions, and the characters included in the range, are determined by the current locale and the values of the LC_COLLATE and LC_ALL shell variables, if set.

For example, in the default C locale, ‘[a-dx-z]’ is equivalent to ‘[abcdxyz]’. Many locales sort characters in dictionary order, and in these locales ‘[a-dx-z]’ is typically not equivalent to ‘[abcdxyz]’; it might be equivalent to ‘[aBbCcDdxYyZz]’, for example. To obtain the traditional interpretation of ranges in bracket expressions, you can force the use of the C locale by setting the LC_COLLATE or LC_ALL environment variable to the value ‘C’, or enable the globasciiranges shell option.
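The manual's example can be reproduced with a small Python model of a range expression evaluated against a collating sequence. The two sequences below are simplified stand-ins (ASCII letters only), and the interleaved "AaBbCc..." order is a hypothetical dictionary-style collation chosen to match the manual's example, not the data of any real locale; `chars_in_range` is a made-up helper.

```python
import string


def chars_in_range(collating_sequence, low, high):
    """Return the characters from low to high, inclusive, in the given collating order."""
    start = collating_sequence.index(low)
    end = collating_sequence.index(high)
    return collating_sequence[start:end + 1]


# C locale: letters compare by code value, so all uppercase precede all lowercase.
c_order = string.ascii_uppercase + string.ascii_lowercase

# Hypothetical dictionary-style order: AaBbCc...Zz.
dictionary_order = "".join(
    upper + lower
    for upper, lower in zip(string.ascii_uppercase, string.ascii_lowercase)
)

c_result = chars_in_range(c_order, "a", "d") + chars_in_range(c_order, "x", "z")
dict_result = (
    chars_in_range(dictionary_order, "a", "d")
    + chars_in_range(dictionary_order, "x", "z")
)

print("C order:         ", c_result)
print("dictionary order:", dict_result)
```

Under the C-style order `[a-dx-z]` covers only `abcdxyz`, while under the interleaved order it also picks up the uppercase letters that sort between the endpoints, giving `aBbCcDdxYyZz` as in the manual.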

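Sanbo

「牛丼専門サンボ」, the beef-bowl shop on a back street in Akihabara, has been in the news for opening its second shop in Jimbocho, but a closer look shows it also says "and we are considering 2 to 3 more shops!" Since incorporating, Sanbo has been expanding actively, for example by opening an X (Twitter) account, and it has finally started moving on its long-standing goal of adding shops pic.twitter.com/VQAi9vDO2Y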
— マウス (@mouseunit) September 4, 2023

Seriously?
