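As usual, it turns out to be an "abridged translation."
Impressions from rewriting a large production system in Go - ワザノバ | wazanova
This is a blog post by Matt Welsh of Google's Mobile Web Performance team, introduced in the golang.org blog
post marking Go's fourth anniversary. He describes his experience adopting Go to rebuild a large production system.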
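1) Background
The original C++ codebase was working without problems, but because it had been shared for years by several
projects with different goals, it had become hard to modify quickly. (A pity it doesn't say specifically what the system is...)
The library for transcoding image formats worked perfectly in C++, so it was left as is, and everything else
was rewritten in Go.
They found that using 20% of the original codebase would be functionally sufficient, and they also wanted to make bold changes to the core logic.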
Volatile and Decentralized: Rewriting a large production system in Go
Rewriting a large production system in Go
My team at Google is wrapping up an effort to rewrite a large production system (almost) entirely in Go. I say
"almost" because one component of the system -- a library for transcoding between image formats --
works perfectly well in C++, so we decided to leave it as-is. But the rest of the system is 100% Go, not just
wrappers to existing modules in C++ or another language. It's been a fun experience and I thought I'd share
some lessons learned.
Plus, the Go language has a cute mascot ... awwww!
Why rewrite?
The first question we must answer is why we considered a rewrite in the first place. When we started this
project, we adopted an existing C++ based system, which had been developed over the course of a couple of years
by two of our sister teams at Google. It's a good system and does its job remarkably well. However, it has been
used in several different projects with vastly different goals, leading to a nontrivial accretion of cruft.
Over time, it became apparent that for us to continue to innovate rapidly would be extremely challenging on this
large, shared codebase. This is not a ding to the original developers -- it is just a fact that when certain
design decisions become ossified, it becomes more difficult to rethink them, especially when multiple teams are
sharing the code.
Before doing the rewrite, we realized we needed only a small subset of the functionality of the original system
-- perhaps 20% (or less) of what the other projects were doing with it. We were also looking at making some
radical changes to its core logic, and wanted to experiment with new features in a way that would not impact the
velocity of our team or the others using the code. Finally, the cognitive burden associated with making changes
to any large, shared codebase is unbearable -- almost any change required touching lots of code that the
developer did not fully understand, and updating test cases with unclear consequences for the other users of the
code.
So, we decided to fork off and do a from-scratch rewrite. The bet we made was that taking an initial productivity
hit during the initial rewrite would pay off in droves when we were able to add more features over time. It has
also given us an opportunity to rethink some of the core design decisions of our system, which has been extremely
valuable for improving our own understanding of its workings.
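Let's start here. The two texts differ a lot in length, so I suppose it can't be helped that a fair amount of
information has dropped out. But doesn't the wording 「元のコードベースの20%を利用すれば」 ("using 20% of the
original codebase") mean that they reuse the original project's code itself? Going by
「we needed only a small subset of the functionality of the original system
-- perhaps 20% (or less) of what the other projects were doing with it.」
this 20% is the size of the "small subset" of functionality. After all, the whole premise is that they rewrote
everything in Go (So, we decided to fork off and do a from-scratch rewrite.).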
2) Why Go?
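There were several things he was anxious about at first. This production system acts as a bridge between users
and their content, so it has to be fast. It also handles a large volume of queries, so CPU/memory efficiency is key.
Go's reliance on a garbage collector made him anxious that controlling memory footprint would be a struggle.
Since this system has many dependencies, having to rewrite in Go all of the many libraries already complete in
C++ was also a burden.
However, when he saw the first code rewriting the core system in Go, although it was work a talented engineer
had done in under a week, he was impressed by Go's readability. In C++, a chain of asynchronous callbacks
spanning dozens of source files, thanks to goroutines, could be consolidated into one file of a few hundred lines. Also,
a standard library suited to web development is provided: HTTP / URL processing / sockets / encryption /
date & timestamp processing / data compression, and so on. Moreover, since it is a compiled language, processing
is fast, which is a big advantage for large projects. With Go's module design, code divides cleanly across
modules and dependencies are easy to understand.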
Why Go?
I'll admit that at first I was highly skeptical of using Go. This production system
sits directly on the serving path between users and their content, so it has to be
fast. It also has to handle a large query volume, so CPU and memory efficiency are key.
Go's reliance on garbage collection gave me pause (pun intended ... har har har),
given how much pain Java developers go through to manage their memory footprint. Also,
I was not sure how well Go would be supported for the kind of development we wanted to
do inside of Google. Our system has lots of dependencies, and the last thing I wanted
was to have to reinvent lots of libraries in Go that we already had in C++. Finally,
there was also simply the fear of the unknown.
My whole attitude changed when Michael Piatek (one of the star engineers in the group)
sent me an initial cut at the core system rewrite in Go, the result of less than a
week's work. Unlike the original C++ based system, I could actually read the code,
even though I didn't know Go (yet). The #1 benefit we get from Go is the lightweight
concurrency provided by goroutines. Instead of a messy chain of dozens of asynchronous
callbacks spread over tens of source files, the core logic of the system fits in a
couple hundred lines of code, all in the same file. You just read it from top to
bottom, and it makes sense.
Michael also made the observation that Go is a language designed for writing Web-based
services. Its standard libraries provide all of the machinery you need for serving
HTTP, processing URLs, dealing with sockets, doing crypto, processing dates and
timestamps, doing compression. Unlike, say, Python, Go is a compiled language and
therefore very fast. Go's modular design makes for beautiful decomposition of code
across modules, with clear explicit dependencies between them. Its incremental
compilation approach makes builds lightning fast. Automatic memory management means
you never have to worry about freeing memory (although the usual caveats with a
GC-based language apply).
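「Our system has lots of dependencies, and the last thing I wanted
was to have to reinvent lots of libraries in Go that we already had in C++.」
being rendered as
「C++で完成しているたくさんのライブラリを全部Goで書き直さなくてはいけないことも負担であった。」
("having to rewrite in Go all of the many libraries already complete in C++ was also a burden")
is a bit off. Where does "was a burden" (負担だった) come from?
Even the sentence just before it in the original is
「I was not sure how well Go would be supported for the kind of development we wanted to do inside of Google.」
And this paragraph runs
I was highly skeptical,
gave me pause,
I was not sure,
and then 「Finally, there was also simply the fear of the unknown.」
So even if you do bring in the word "burden" (負担), I think it ought to come out as something like
"the size of that burden was another source of anxiety."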
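The latter half of the translation has the same kind of problem.
「しかし、コアシステムをGo言語で書き直す最初のコードを見たとき、優秀なエンジニアが1週間以内でやった作業であっ
たが、Go言語の可読性に感心した。」 ("However, when he saw the first code rewriting the core system in Go,
although it was work a talented engineer had done in under a week, he was impressed by Go's readability.")
comes from
「My whole attitude changed when Michael Piatek (one of the star engineers in the group)
sent me an initial cut at the core system rewrite in Go, the result of less than a
week's work.」 and readability does not appear there at all. It appears in the next sentence:
「Unlike the original C++ based system, I could actually read the code,
even though I didn't know Go (yet).」
Even granting that merging the two sentences is fine in itself, I still can't see where "was impressed by the
readability" (可読性に感心した) comes from.
Also, poor Michael Piatek, whose name got erased.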
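Furthermore, regarding
「C++の場合、数十のソースファイルにまたがった非同期コールバックのチェーンが、
goroutineのおかげで、数百行のコードで一つのファイルにまとめることができた。」 ("In C++, a chain of
asynchronous callbacks spanning dozens of source files, thanks to goroutines, could be consolidated into one
file of a few hundred lines.")
(I have a thing or two to say about the comma usage as well, but I'll let that pass),
taking only "goroutine" out of the original's 「lightweight concurrency provided by goroutines.」
is unkind to readers who don't know Go.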
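For readers who don't know Go, a rough picture of what "lightweight concurrency provided by goroutines" refers to may help. What follows is only a minimal sketch, not code from the system Welsh describes: fetch and fetchAll are hypothetical names, and fetch merely stands in for some blocking call such as a network request. The point is that the concurrent logic reads top to bottom in one place, with no callback chain.

```go
package main

import (
	"fmt"
	"sync"
)

// fetch is a hypothetical stand-in for a blocking call (for example, a network request).
func fetch(name string) string {
	return "result for " + name
}

// fetchAll starts one goroutine per name and collects the results in input order.
func fetchAll(names []string) []string {
	results := make([]string, len(names))
	var wg sync.WaitGroup
	for i, name := range names {
		wg.Add(1)
		go func(i int, name string) {
			defer wg.Done()
			results[i] = fetch(name) // each goroutine writes only its own slot
		}(i, name)
	}
	wg.Wait() // block until every goroutine has finished
	return results
}

func main() {
	for _, r := range fetchAll([]string{"a", "b", "c"}) {
		fmt.Println(r)
	}
}
```

The go keyword starts each call concurrently, and sync.WaitGroup is the standard library's way to wait for all of them.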
As for the sentences corresponding to the original's third paragraph... how shall I put it,
the 「Michael also made the observation」 part has gone missing somewhere.
3) Being terse
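He was used to writing code over many lines with long variable names, so Go's terse style puzzled him at first,
but in the end he is grateful because readability went up and things sped up. You don't have to write lots of
boilerplate, and you don't have to split logic between header files and .cc files as in C++. Also, unlike Java,
you don't need to write out everything the compiler can guess (such as variable types). You can achieve type
safety with the feel of writing a terse scripting language like Python.
In the end, the Go codebase came to 21,000 lines in 121 files. The original C++ was 460,000 lines in 1,400 files.
Of course this difference results from rewriting with the functionality narrowed to a subset, but he feels that the code shrank by more than the functionality was reduced.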
Being terse
Syntactically, Go is very succinct. Indeed, the Go style guidelines encourage you to
write code as tersely as possible. At first this drove me up the wall, since I was
used to using long descriptive variable names and spreading expressions over as many
lines as possible. But now I appreciate the terse coding approach, as it makes reading
and understanding the code later much, much easier.
Personally, I really like coding in Go. I can get to the point without having to write
a bunch of boilerplate just to make the compiler happy. Unlike C++, I don't have to
split the logic of my code across header files and .cc files. Unlike Java, you don't
have to write anything that the compiler can infer, including the types of variables.
Go feels a lot like coding in a lean scripting language, like Python, but you get type
safety for free.
Our Go-based rewrite is 121 Go source files totaling about 21K lines of code
(including comments). Compare that to the original system, which was 1400 C++ source
files with 460K lines of code. (Remember what I said about the new system implementing
a small subset of the new system's functionality, though I do feel that the code size
reduction is disproportionate to the functionality reduction.)
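For 「コンパイラーが推察できる」 as a rendering of 「compiler can infer」: given the established term 型推論
(type inference), wouldn't 推論 ("infer") be better than 推察 ("guess")? And for
「また、Javaと違って、コンパイラーが推察できるもの(変数のタイプとか)を何でも書いたりする必要がない。」
「Unlike Java, you don't have to write anything that the compiler can infer, including the types of variables. 」
something like 「Java とは違って、変数の型のようにコンパイラーが推論可能なものは(一切)書く必要がない。」
("Unlike Java, you don't have to write anything (at all) that the compiler can infer, such as the types of
variables.") would do, I think.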
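To make "anything that the compiler can infer" concrete, here is a small sketch of type inference with the := short declaration. parsePort and describe are made-up names for illustration only; the point is just that no variable type is written at the call site, yet every variable still has a fixed static type.

```go
package main

import "fmt"

// parsePort is a hypothetical helper; its signature determines the types
// that := infers at the call site.
func parsePort(s string) (int, error) {
	var port int
	_, err := fmt.Sscanf(s, "%d", &port)
	return port, err
}

// describe formats the inferred types of port and name with the %T verb,
// followed by whether parsing succeeded.
func describe() string {
	port, err := parsePort("8080") // port is an int and err is an error; neither type is spelled out
	name := "gopher"               // name is a string
	return fmt.Sprintf("%T %T %v", port, name, err == nil)
}

func main() {
	fmt.Println(describe())
}
```

Printing with %T is also one low-tech way to find out what a variable "actually is" when the declaration does not say.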
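「スピードアップしたので感謝している。」 ("he is grateful because things sped up")
Where on earth did this come from...
Since it says "grateful," is it
But now I appreciate the terse coding approach, as it makes reading
and understanding the code later much, much easier. Somewhere around here?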
4) What about ramp-up time?
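The standard library documentation and online tutorials are extensive, so people with experience in C-family
languages can catch up quickly.
This project took 5 months in total, covering the rewrite plus 3 or 4 new features. Productivity is up
substantially from the move to the new codebase.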
What about ramp-up time?
Learning Go is easy coming from a C-like language background. There are no real
surprises in the language; it pretty much makes sense. The standard libraries are very
well documented, and there are plenty of online tutorials. None of the engineers on
the team have taken very long at all to come up to speed in the language; heck, even
one of our interns picked it up in a couple of days.
Overall, the rewrite has taken about 5 months and is already running in production. We
have also implemented 3 or 4 major new features that would have taken much longer to
implement in the original C++ based system, for the reasons described above. I
estimate that our team's productivity has been improved by at least a factor of ten by
moving to the new codebase, and by using Go.
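「C言語系の経験がある人にはすぐにキャッチアップできる。」 ("people with experience in C-family languages can
catch up quickly") cuts too much, or perhaps summarizes too much.
「None of the engineers on the team have taken very long at all to come up to speed in the language;
heck, even one of our interns picked it up in a couple of days.」
At the very least it should say "were able to catch up" (キャッチアップできた), since that is what happened on his team.
「that would have taken much longer to implement in the original C++ based system,
for the reasons described above. I estimate that our team's productivity has been improved by
at least a factor of ten by moving to the new codebase, and by using Go.」
becoming
「新しいコードベースへの移行で生産性は大幅アップ。」 ("Productivity is up substantially from the move to the new codebase.")
leaves me wondering how it ended up like that: "at least a factor of ten" and "by using Go" are both gone.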
And now the main item for today:
5) Why not Go?
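The things he struggles with in Go are:
First, you have to understand whether the variable you are handling is an interface or a struct. Of course a
struct can implement an interface, so in general the two can be treated as the same thing. However, when handling a
struct, you might be passed a reference of type *myStruct, or you might be passed a variable of type myStruct.
On the other hand, if you are handling a mere interface, it has no pointer, because in a sense the interface is
the pointer. When it is an interface rather than a struct, you may get confused by having to suspect that code passing things without * might actually be a pointer.
Go's style of guessing types makes for terse code, but if the type of a variable is not stated explicitly you
have to look it up. For example, given foo, bar := someFunc(baz), if you want to add code that operates on foo
and bar, you have to check what foo and bar are. (Maybe an IDE can do that, but I hate using a mouse when
writing code.)
Also, a struct can end up implementing an interface unintentionally. There is no need to state which interfaces
a struct implements. This should be noted in a code comment, but it makes intent hard to see when reading other
people's code. Also, when refactoring an interface, you have to check every implementation, including undeclared ones.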
Why not Go?
There are a few things about Go that I'm not super happy about, and that tend to bite
me from time to time.
First, you need to "know" whether the variable you are dealing with is an
interface or a struct. Structs can implement interfaces, of course, so in general you
tend to treat these as the same thing. But when you're dealing with a struct, you
might be passing by reference, in which the type is *myStruct, or you might be passing
by value, in which the type is just myStruct. If, on the other hand, the thing you're
dealing with is "just" an interface, you never have a pointer to it -- an
interface is a pointer in some sense. It can get confusing when you're looking at code
that is passing things around without the * to remember that it might actually
"be a pointer" if it's an interface rather than a struct.
Go's type inference makes for lean code, but requires you to dig a little to figure
out what the type of a given variable is if it's not explicit. So given code like:
foo, bar := someFunc(baz)
You'd really like to know what foo and bar actually are, in case you want to add some
new code to operate on them. If I could get out of the 1970s and use an editor other
than vi, maybe I would get some help from an IDE in this regard, but I staunchly
refuse to edit code with any tool that requires using a mouse.
Finally, Go's liberal use of interfaces allows a struct to implement an interface
"by accident". You never have to explicitly declare that a given struct
implements a particular interface, although it's good coding style to mention this in
the comments. The problem with this is that it can be difficult to tell when you are
reading a given segment of code whether the developer intended for their struct to
implement the interface that they appear to be projecting onto it. Also, if you want
to refactor an interface, you have to go find all of its (undeclared) implementations
more or less by hand.
Most of all I find coding in Go really, really fun. This is a bad thing, since we all
know that "real" programming is supposed to be a grueling, painful exercise
of fighting with the compiler and tools. So programming in Go is making me soft. One
day I'll find myself in the octagon ring with a bunch of sweaty, muscular C++
programmers bare-knuckling it out to the death, and I just know they're going to mop
the floor with me. That's OK, until then I'll just keep on cuddling my stuffed gopher
and running gofmt to auto-intent my code.
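The struct/interface points in the quoted section are easier to follow with code in front of you. The sketch below is my own illustration, not anything from Welsh's post: Renamer, Gopher and the three rename functions are hypothetical names. It shows a struct passed by value, by pointer, and through an interface (where no * appears in the signature but the original is still modified), plus a commonly used compile-time assertion that makes "this struct is meant to implement this interface" explicit.

```go
package main

import "fmt"

// Renamer is a hypothetical interface used only for this illustration.
type Renamer interface {
	Name() string
	Rename(string)
}

// Gopher never declares that it implements Renamer; having the methods is enough.
type Gopher struct{ name string }

func (g *Gopher) Name() string    { return g.name }
func (g *Gopher) Rename(n string) { g.name = n }

// Compile-time assertion that *Gopher satisfies Renamer. If the interface is
// changed later, this line stops compiling and points at the implementation.
var _ Renamer = (*Gopher)(nil)

func renameByValue(g Gopher)       { g.name = "copy" }        // changes only the local copy
func renameByPointer(g *Gopher)    { g.name = "pointer" }     // changes the caller's struct
func renameViaInterface(r Renamer) { r.Rename("interface") } // no * in sight, but r holds a pointer

// demo records the struct's name after each kind of call.
func demo() []string {
	g := Gopher{name: "original"}
	var out []string
	renameByValue(g)
	out = append(out, g.name)
	renameByPointer(&g)
	out = append(out, g.name)
	renameViaInterface(&g)
	out = append(out, g.name)
	return out
}

func main() {
	fmt.Println(demo())
}
```

The var _ line costs nothing at run time; it addresses only the "by accident" and refactoring complaints, not the pointer-or-value confusion.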
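First of all, the heading is 「Why not Go?」 and the last paragraph of the body even says
「Most of all I find coding in Go really, really fun. 」, so
why on earth
dispose of the whole section with 「Goで苦労していることは、」 ("The things he struggles with in Go are:")?
Honestly.
The translation here is also full of things to pick at, but I'm tired by now, so I'll stop
(having said that, I may well bring it up again tomorrow or so :)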