<![CDATA[app-engine]]> app-engine http://www.djindien.com/ Ghost 0.11 Fri, 02 Nov 2018 09:36:42 GMT <![CDATA[3 Years on Google App Engine. An Epic Review.]]> http://www.djindien.com/3-years-on-google-app-engine-an-epic-review/ hosting java datastore google app engine Mon, 13 Mar 2017 15:23:59 GMT
<p>For the past 3 years I have worked on an application that runs on Google App Engine. It is a fascinating and unique service that Google offers here, unlike anything else you will find elsewhere. This is my in-depth, personal take on it.</p>
<p><img src="http://www.djindien.com/content/images/2017/01/google-app-engine-logo.png" alt=""></p>
<h2 id="googlescloudest2008">Google's Cloud (est. 2008)</h2>
<p>First of all, what is <a href="https://cloud.google.com/appengine/docs">Google App Engine</a> (GAE) actually? 
It is a platform to run your web applications on. Like <a href="https://www.heroku.com">Heroku</a>, but different when you look closely. It is also a versatile cloud computing platform. Like <a href="https://aws.amazon.com">AWS</a>, but different. Let me explain.</p>
<p>Google launched GAE in 2008, when cloud computing was still in its infancy. Amazon was ahead of them, having started to rent out its IT infrastructure in 2006. But with GAE, Google offered a sophisticated Platform as a Service (PaaS) very early on, which Amazon would not match until 2011 with its Elastic Beanstalk service. Now what is so special about GAE?</p>
<p>It is a <em>fully-managed</em> application platform. So far, I do not know a platform which comes close to GAE's full package: log management, mail delivery, scaling, memcache, image manipulation, distributed Cron jobs, load balancing, version management, task queue, search, performance analysis, cloud debugging, content delivery network - and that is not even mentioning auxiliary services that have popped up on Google's cloud in the meantime like SQL, BigQuery, file storage... the list goes on.</p>
<p>By using Google App Engine, you can run your app on top of (probably) the world's best infrastructure. On top of that, you get functionality out of the box that would require at least third-party add-ons on Heroku, or weeks of setup time if you did it yourself. <em>This</em> is GAE's appeal.</p>
<p>Noteworthy applications that run on GAE include <a href="https://www.snapchat.com">Snapchat</a> and <a href="https://www.khanacademy.org">Khan Academy</a>.</p>
<h2 id="development">Development</h2>
<p>The web app I was working on all this time is a single, large Java application. App Engine also supports Python, PHP and Go. Now you might wonder why the choice is so limited. One reason is that, in order to offer a fully managed environment, Google needs to integrate the platform with the environment. You could say that environment and platform are tightly coupled. That takes a lot of effort and investment, which becomes very clear once you start developing for GAE.</p>
<h3 id="sdk">SDK</h3>
<p>Each app needs to use a special SDK (Software Development Kit) to use the APIs offered by GAE. The SDK is huge. For example, the Java SDK download is about 190 MB. Granted, some of the JARs in there are not needed for most use cases and some only during development - but still, it certainly is not lightweight (even for Java, that is).</p>
<p>The SDK is not just your bridge to the world of Google App Engine but also serves as its 
simulation on your local machine. For nearly all GAE APIs, it provides stubs you can develop against. First of all, this means that when you run the app locally, it behaves <em>quite</em> close to how it will in production. Secondly, you can easily write integration tests against the APIs. And usually this will get you very far; the mismatch between the production and stub behavior is quite small.</p>
<h3 id="javaapis">Java APIs</h3>
<p>Speaking of APIs, you are in for a surprise when you use certain Java APIs. Since GAE runs your application in some kind of <a href="https://cloud.google.com/appengine/docs/java/runtime?csw=1#The_Sandbox">sandbox</a>, it forbids using particular Java APIs. The main restrictions include writing to the file system, certain methods of <code>java.lang.System</code> and using the Java Native Interface (JNI). There are also some peculiarities around using threads and sockets, but more on that later.</p>
<p>One interesting thing is that the Java SDK actually ensures you do not use these restricted APIs locally. When you run the app or just an integration test, it uses a Java agent that monitors every method call you make. It immediately throws an exception for any violation it detects. This helps to discover violations early rather than only in production, but it also has a nasty side effect: when you profile the app's performance, the agent's violation checks add considerable overhead. In the end, it is hard to judge your app's actual performance since the more method calls you make, the more overhead the agent generates.</p>
<h3 id="javadevelopmentkitjdk">Java Development Kit (JDK)</h3>
<p>The next thing you might notice when you start developing is that you can <em>not</em> use Java 8. Although Java 7's end of life was in 2015, it is still very much alive and kicking on GAE. The third highest voted issue on <a href="https://code.google.com/p/googleappengine/issues">GAE's issue tracker</a> is <a href="https://code.google.com/p/googleappengine/issues/detail?id=9537">support for Java 8</a> (the second highest is support for Python 3). It was created in 2013. Since then, the only news about any progress on the matter has been a 2016 post on the App Engine mailing list stating that the engineers are actively working on it. Well, good for you.</p>
<p>Obviously, this limitation is a major annoyance for any developer. For me personally, the missing lambda support weighs heavily. Of course, one could migrate to one of the many JVM languages like Groovy, Scala or Kotlin, all of which offer even more features than Java 8. But this is a costly and risky investment. Too costly and too risky for our project. We also investigated the feasibility of <a href="https://github.com/orfjackal/retrolambda">retrolambda</a>, a backport of lambdas to Java 7, but have not pursued it yet although it looked promising in first tests.</p>
<p>Having to stay with an old version is also a 
liability for the business. It makes it harder to find developers. Overall application security is also at risk. Google support told us that we would still receive security patches for the production JDK 7. But eventually, all major libraries like Spring will stop supporting it. Eventually, you'll be <em>stuck</em>.</p>
<h2 id="deployment">Deployment</h2>
<p>To deploy your application, you need to create an <code>appengine-web.xml</code> configuration file. There, you specify the application ID and version plus a few additional settings, e.g. marking the app as <code>threadsafe</code> to be able to receive multiple requests per instance simultaneously.</p>
<h3 id="upload">Upload</h3>
<p>App Engine expects to receive your Java application as a packaged WAR file. You can upload it to the servers with the <code>appcfg</code> script from the SDK. Optionally, there are plugins for Maven and Gradle which make it as simple as typing <code>mvn appengine:update</code>. For a typical Java application, the upload can take <em>quite</em> a while, so you had better have a fast internet connection. Once the process finishes, you can see your newly deployed version in the Google Cloud Console:</p>
<p><img src="http://www.djindien.com/content/images/2017/01/Screen-Shot-2017-01-22-at-15.15.45-2.png" alt="Google Cloud Console - Versions"></p>
<h3 id="staticfiles">Static Files</h3>
<p>Static files like images, stylesheets and scripts are part of any web application today. In the <code>appengine-web.xml</code> file, files can be marked as static. Google will then serve these files directly - without hitting your application. It is not <em>exactly</em> a Content Delivery Network (CDN) since it is not distributed to hundreds of edge nodes, but it helps to reduce the load on your servers.</p>
<h3 id="versions">Versions</h3>
<p>The nice thing in App Engine is that everything you deploy has a specific version. Each version can be accessed via <code>https://&lt;version&gt;-dot-&lt;app-id&gt;.appspot.com</code>. But which one is <em>actually</em> live?</p>
<p>You can mark a version as <code>default</code>. This means that when you go to <code>https://&lt;app-id&gt;.appspot.com</code> (or the domain you have assigned to the app), this is the version that receives all requests. Switching a version to <code>default</code> is very easy: it only takes the push of a button or a simple terminal command. GAE can switch immediately or migrate your traffic incrementally to prevent overwhelming the new version.</p>
<p>There is also one option (which we never used) that allows you to distribute your traffic across 
multiple versions. This allows incrementally rolling out a new version by only giving it to a fraction of the user base before making it available for everyone.</p>
<p>Since it is so easy to create new versions and switch production traffic between them, GAE is a perfect platform to practice <a href="https://martinfowler.com/bliki/BlueGreenDeployment.html">blue-green deployment</a>. Every time we needed to roll back because of a bug in a new version, it was effortless. By writing a somewhat smart deployment script, continuous delivery is possible as well.</p>
<h3 id="instances">Instances</h3>
<p>Every version can run any number of instances (the only limit is your credit card). The actual number is the result of incoming traffic and the scaling configuration of your app; we'll look at that later. Google will distribute incoming requests among all running instances of that version. You can see a list of instances, including some basic metrics like requests and latency, in the Google Cloud Console:</p>
<p><img src="http://www.djindien.com/content/images/2017/01/Screen-Shot-2017-01-22-at-15.28.57.png" alt="Google Cloud Console - Instances"></p>
<p>The hardware options you can choose from to run these instances on are - let's be frank here - pathetic. App Engine basically offers four different <a href="https://cloud.google.com/appengine/docs/about-the-standard-environment#instance_classes">instance classes</a> ranging from 128MB and 600MHz CPU (you read that correctly) to 1024MB and 2.4GHz CPU. Yes, again, this is true. And really sad. On a developer's laptop our app started almost twice as fast as in production.</p>
<h3 id="services">Services</h3>
<p>So far, I have only talked about a single, monolithic application. But what do you do if yours consists of multiple services? 
App Engine has got you covered. Every app is a service. If you have only one, it is simply named <code>default</code>. You can access each one directly via <code>https://&lt;version&gt;-dot-&lt;service&gt;-dot-&lt;app-id&gt;.appspot.com</code>.</p>
<p><img src="http://www.djindien.com/content/images/2017/02/Screen-Shot-2017-02-20-at-21.36.57.png" alt="App Engine Application, Service, Version and Instance"></p>
<p>You can easily deploy multiple versions of each service, scale and monitor them separately. And since each service is separate from the others, you could run any combination of the supported languages. Unfortunately, some configuration settings are shared across all services. So they are not completely isolated. All in all, GAE seems to be a great fit for microservices. Google also provides <a href="https://cloud.google.com/appengine/docs/java/microservices-on-app-engine">detailed documentation</a> on this topic.</p>
<p>For reasons that will become clear later, we decided to separate our application into two services: frontend (user-facing) and backend (background work). But to do that, we did not actually split the monolith in two - that would have taken months. We simply deployed the same app twice and only sent users to one service and background work to the other.</p>
<h2 id="operations">Operations</h2>
<p>Let's talk about what it means to <em>run</em> your application on App Engine. As you will see, it imposes a number of restrictions on you. But it is not all bad news. In the end you will understand why.</p>
<h3 id="applicationstartup">Application Startup</h3>
<p>When App Engine starts a new instance, the app needs to initialize. App Engine will either send the HTTP request of a user directly to the app or - if configured and the scaling circumstances allow - send a so-called warmup request. Either way, the first request is called a loading request. And as you can imagine, starting quickly is important.</p>
<p>The instance itself on the other hand is ridiculously fast to start. If you have launched a server in the cloud before, you may have waited a minute or more. Not on GAE. Instances start almost instantly. I guess Google keeps a pool of servers ready to go. The bottleneck will always be your own application. Our app took more than 40 seconds to start in production. So unless we wanted to split our huge monolith into separate services, we needed to make it start more efficiently.</p>
<p>The app uses Spring. Google even has a dedicated documentation entry just for that: <a href="https://cloud.google.com/appengine/articles/spring_optimization">Optimizing Spring Framework for App Engine Applications</a>. There we found the inspiration for our most important startup 
optimization.</p>
<p>We got rid of Spring's classpath scanning. It is particularly slow on App Engine (probably because of the terrible CPU). Luckily, there is a library called <a href="https://github.com/atteo/classindex">classindex</a>. It uses special annotations to write the fully-qualified names of classes to a text file. By simply reading the beans from the text file, the Spring initialization went down by about 8-10 seconds.</p>
<h3 id="requesthandling">Request Handling</h3>
<p>The very first thing I have to mention here is the requirement of App Engine to handle a user request within 60 seconds and a background request within 10 minutes. When the application takes too long to respond, the request is aborted with a 500 status code and a <code>DeadlineExceededException</code> is thrown.</p>
<p>Usually, this shouldn't be a problem. If your app takes more than 60 seconds to respond, the user is long gone anyway. However, since an instance is started by an HTTP request, this also means it has to start within 60 seconds. In production, we observed variations in startup time of up to 10 seconds. This means you now have less than 50 seconds to start your app. It is not uncommon for a Java app to take that long.</p>
<p>One nice little feature I'd like to highlight is the geographical HTTP headers: for each incoming user request, Google adds headers that contain the user's country, region, city as well as latitude and longitude of said city. This can be <em>very</em> useful, for example for pre-filling phone number country codes or detecting unusual account login locations. From our observations, the accuracy seems to be high as well. Getting that kind of information with this level of accuracy from a third-party API or database is usually quite cumbersome and/or expensive. So getting it for free on App Engine is a nice bonus.</p>
<h3 id="backgroundwork">Background Work</h3>
<h4 id="threads">Threads</h4>
<p>As mentioned earlier, there are restrictions on using Java threads. While it is possible to start a new thread, albeit only via the custom GAE <code>ThreadManager</code>, it cannot 'outlive' the request it was created in. This can be annoying in practice since third-party libraries of course do not follow App Engine's restrictions. Finding a compatible library, or adapting a seemingly incompatible one, cost us a lot of sweat and tears over the years. For example, we could not use the <a href="https://github.com/dropwizard/metrics">Dropwizard metrics</a> library out of the box since it relies on using a background thread.</p>
<h4 id="queue">Queue</h4>
<p>But there are other ways of doing background work: In the spirit of the Cloud, you apply the divide and conquer approach on the instance level. By using <a 
href="https://cloud.google.com/appengine/docs/java/taskqueue/">task queues</a>, you can enqueue work for later processing. For example, when an email needs to be sent, you enqueue a new task with a payload (e.g. recipient, subject and body) and a URL on a <em>push</em> queue. One of your instances will then receive the payload as an HTTP POST request to the specified endpoint. If it fails, App Engine will retry the operation.</p>
<p>This pattern really shines when you have a lot of work to process. Simply enqueue a batch of tasks that run independently. App Engine will take care of the failure handling. No custom retry code required. Just imagine how awkward it would be without it: running hundreds of tasks at once, you either need to stop and start from scratch when an error occurs or carefully track which have failed and enqueue them again for another attempt.</p>
<p>And just like the rest of App Engine, task queues scale beautifully. A queue can receive a virtually unlimited number of tasks. The downside is that a payload can only be up to 1 MB in size. But we usually just pass a reference to the data to the queue. But then, you need to take extra good care in your data handling since it can easily happen that something vanishes between the time you enqueue a task and the time that task is actually executed.</p>
<p>The queues are configured in a <code>queue.xml</code> file. Here is an example of a push queue that fires up to one task per second with a maximum of two retries:</p>
<pre><code class="language-xml">&lt;queue&gt;
  &lt;name&gt;my-push-queue&lt;/name&gt;
  &lt;rate&gt;1/s&lt;/rate&gt;
  &lt;retry-parameters&gt;
    &lt;task-retry-limit&gt;2&lt;/task-retry-limit&gt;
  &lt;/retry-parameters&gt;
&lt;/queue&gt;
</code></pre>
<h4 id="cron">Cron</h4>
<p>Another extremely valuable tool is the distributed Cron. In <code>cron.xml</code>, you can tell GAE to issue requests at certain time intervals. These are just simple HTTP GET requests that one of your instances will receive. The smallest possible interval is once per minute. It is very useful for regular reports, emails and cleanups.</p>
<p>This is what an entry in <code>cron.xml</code> looks like:</p>
<pre><code class="language-xml">&lt;cron&gt;
  &lt;url&gt;/tasks/summary&lt;/url&gt;
  &lt;schedule&gt;every 24 hours&lt;/schedule&gt;
&lt;/cron&gt;
</code></pre>
<p>A Cron job can also be combined with <em>pull</em> queues: they allow you to actively fetch a batch of tasks from a queue. Depending on the use case, making an instance pull lots of tasks in a batch can be much more 
efficient than pushing them to the instance individually.</p>
<p>Like all other App Engine configuration files, the <code>cron.xml</code> is shared across all services and versions of an application. This can be annoying. In our case, sometimes when we deployed a version that added a new Cron entry, App Engine would start sending requests to an endpoint that did not exist on the live (but older) version - generating noise in our production error reporting. I imagine this must be even more painful when using App Engine to host microservices.</p>
<p>Also, the Cron jobs are not run locally. I can understand why that is the case: many jobs are scheduled outside of the usual busy hours and thus would not even fire during a normal workday. But <em>some</em> run every few minutes or hours - and these are the interesting ones. For example, they might trigger notifications. You want to see those locally. Because eventually you will introduce a change that causes undesired behavior (as happened multiple times in our project), and seeing it locally might stop you from shipping it. But simulating Cron jobs locally is tricky (and unfortunately we never bothered). One would probably need to write an external tool that parses the <code>cron.xml</code> and then pings the corresponding endpoints (yuck!).</p>
<h3 id="scaling">Scaling</h3>
<p>App Engine will take care of scaling the number of instances based on the traffic. How? Well, that depends on how you have configured your application. There are three modes:</p>
<ul>
<li><strong>Automatic:</strong> This is GAE's unique selling point. It scales the number of instances based on metrics such as request rate and response latency. So if there is a lot of traffic or your app is slow to respond, more instances spin up.</li>
<li><strong>Manual:</strong> Basically like your good old virtual private servers. You tell Google how many instances you want and Google provides them. This fixed number of instances is useful if you know <em>exactly</em> what traffic you are going to get.</li>
<li><strong>Basic:</strong> Essentially the same as manual scaling mode but when an instance becomes idle, it is turned off.</li>
</ul>
<p>The most useful and interesting one here certainly is the <em>automatic mode</em>. It has a few parameters that help to shed some light on how it works internally: <code>max_concurrent_requests</code>, <code>max_idle_instances</code>, <code>min_idle_instances</code> and <code>max_pending_latency</code>. To quote the App Engine documentation:</p>
<blockquote> <p>The App Engine scheduler decides whether to serve each new request with an existing instance (either one that is idle or accepts concurrent requests), put 
the request in a pending request queue, or start a new instance for that request. The decision takes into account the number of available instances, how quickly your application has been serving requests (its latency), and how long it takes to spin up a new instance.</p> </blockquote>
<p>Every time we tried to tweak those numbers, it felt like practicing black magic. It is really hard to reason your way to a good setup here. Yet, these numbers determine the real-world performance of your app and hugely affect your monthly bill.</p>
<p>But all in all, the automatic scaling is pretty wicked. It is especially well suited for handling background work (e.g. generating reports, sending emails) since it often - more so than user requests - comes in large, sudden bursts.</p>
<p>But the thing is, Java is a terrible fit for this kind of auto scaling due to its slow startup time. To make matters worse, it is quite common for the scheduler to assign a request to a <em>starting</em> (cold) instance. Then all the effort you put into sub-second REST responses goes out the window. There has been an open issue asking that <a href="https://code.google.com/p/googleappengine/issues/detail?id=7865">user-facing requests are never locked to cold instances</a> since 2012. It has not even elicited the slightest comment by Google other than the status change to 'Accepted' (sounds like one of the stages of grief at this point).</p>
<p>This also explains why we split our app into two services. Before that, we often found that user requests suffered when there was a surge of background requests. This is because App Engine scaled up the instances massively and, since requests are routed evenly across instances, more user requests hit cold instances. By splitting the app, we greatly reduced this. Also, we were able to apply different scaling strategies for the two services.</p>
<p>One last thing: In a side-project, I used Go on App Engine and discovered a new perspective on App Engine. Go's hallmark is that it starts an application almost instantly. This makes App Engine and Go a perfect combination, like Batman and Robin. From what I have seen, together they embody what I personally expect from the cloud. It truly adapts to the workload and does so effortlessly. Not even the abysmal hardware options seemed to pose a real problem for Go since it is that efficient.</p>
<h2 id="data">Data</h2>
<p>When App Engine launched, the only database options you had were Google Datastore for structured data and Google Blobstore for binary data. Since then, they have added Google Cloud SQL (managed MySQL) and Google Cloud Storage (like Amazon's S3), which replaced the Blobstore. From the beginning App Engine offered a managed Memcache, as well.</p>
<p>It used to be very difficult to 
connect to a third-party database since you could only use HTTP for communication. But usually databases require raw TCP. This changed only a few years ago when the Socket API was released. But it is <em>still</em> in Beta, which makes it a questionable choice for mission-critical use. So when it comes to databases, there is still a lot of vendor lock-in.</p>
<p>Anyway, in the beginning, there was only the Datastore.</p>
<h3 id="datastore">Datastore</h3>
<p>The Datastore is a proprietary NoSQL database, fully managed by Google. It is unlike anything I had used before. It is a massively scaling beast with very unique traits, guarantees and restrictions.</p>
<p>In the early days, the Datastore was based on a master-slave setup which featured strongly consistent reads. A few years in, after it had suffered a few severe outages, Google <a href="http://googleappengine.blogspot.ca/2011/01/announcing-high-replication-datastore.html">introduced a new configuration option</a>: High Replication. The API stayed the same but the write latency increased and some reads became <em>eventually</em> consistent (more on that later). The upside was the significantly increased availability. It even comes with a 99.95% uptime SLA. In all the time I have used it, I have never encountered a single problem with the Datastore's availability. It was just something you did not have to think about.</p>
<h4 id="entities">Entities</h4>
<p>The basics of the Datastore are simple. You read and write <em>entities</em>. They are categorized into a specific <em>kind</em>. An entity consists of <em>properties</em>. A property has a name and a value of a certain type, such as <code>string</code>, <code>boolean</code>, <code>float</code> or <code>integer</code>. Each entity also has a unique <em>key</em>.</p>
<h4 id="writing">Writing</h4>
<p>There is no schema whatsoever, though. Entities of the same kind can look completely different. This makes development really easy: just add a new property, save the entity and it is there. On the other hand, you need to write custom migration code to rename a property. The reason is that entities cannot be updated in place - they have to be loaded, changed and saved again. Depending on the number of entities, this can become a non-trivial task since you might need to use a task queue to get around the request time limit. In my experience, this leads to old property names all over the place since refactoring is so costly and dangerous.</p>
<p>There are some <a href="https://cloud.google.com/datastore/docs/concepts/limits">limits for working with entities</a>. The two most critical are:</p>
<ul>
<li>An entity may only be 1MB in total, including additional meta data of the encoded entity</li>
<li>You can only write to an entity (group, to be exact) up to once per second</li>
</ul>
<p>In practice, this can be an 
issue. We rarely hit the size limit - but when we did, it was painful. Customer data could get lost. When you hit the write rate limitation, it is usually fine on the next try. But of course you have to design your app to minimize the chances of that happening. For example, something like a regularly updated counter takes a lot of work to get right. Google even has a documentation entry on <a href="https://cloud.google.com/appengine/articles/sharding_counters">using sharding to build a counter</a>.</p>
<h4 id="reading">Reading</h4>
<p>An entity can be fetched by using its key or via a query. Reads by key are strongly consistent, which means you will receive the latest data even if you updated the entity right before fetching it. This does not apply to queries, however. They are eventually consistent, so writes are not always reflected immediately. This can lead to problems and might need to be mitigated, for example via clever data modelling (e.g. using a mnemonic as the key) or by leveraging special Datastore features (e.g. entity groups).</p>
<p>A query always specifies an entity kind and optional filters and/or sort orders. Every property that is used in a filter or as a sort key must be indexed. Adding an index can only be done as part of a regular write operation. It does not happen automatically in the background like in most SQL databases. The index will also increase the time and cost of a write operation (more on that later).</p>
<p>If a query involves multiple properties, it requires a multi-index. It has to be specified in a configuration file called <code>datastore-indexes.xml</code>. Here is an example:</p>
<pre><code class="language-xml">&lt;datastore-index kind="Employee" ancestor="false"&gt;
  &lt;property name="lastName" direction="asc" /&gt;
  &lt;property name="hireDate" direction="desc" /&gt;
&lt;/datastore-index&gt;
</code></pre>
<p>In contrast to other databases, the absence of a multi-index will not just result in an inefficient, slow query - it will fail immediately. The Datastore tries its best to enforce performant queries. For example, inequality filters are only supported on a single property. Of course, there are always ways to shoot yourself in the foot - but they are rare.</p>
<p>There are several other features I cannot go into now, for example pagination, projection queries and transactions. Go to the <a href="https://cloud.google.com/appengine/docs/standard/java/datastore/api-overview">Datastore documentation</a> to learn more; it is very extensive and helpful.</p>
<p>Compared to other databases the read and write operations are very slow. From my observations, a read by key takes 10-20ms on average. Major deviations are rare. My best guess is that Google serializes entities and only indexes are actually kept in memory.</p>
<p>The pricing model seems to support that: you pay for stored data, read, write and delete operations. That's 
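it. Note that database memory is not on that list. The operations themselves are quite cheap, too: reading 100k entities costs $0.06 and 100k write operations cost $0.18 - where a write operation can be the actual entity write but also a write to each index. If you do not write anything, you do not pay anything. Yet within a minute you can be writing gigabytes of data. And here is the kicker: the read and write performance is basically the same for a database with no entities or a billion. It scales like crazy.</p>
<h4 id="api">API</h4>
<p>The API to the Datastore feels <em>very</em> low-level. Therefore, for any serious Java app there is no way around <a href="https://github.com/objectify/objectify">Objectify</a>. It is a library written by Jeff Schnitzer. If Google has not done so already, they should write him a huge cheque for making App Engine a better place. He wrote it for his own business, but his relentless dedication over the years, the extensive documentation and the support he provides in the forums are astounding. With Objectify, working with the Datastore is actually fun.</p>
<p>Here is an example from the documentation:</p>
<pre><code class="language-java">@Entity
class Car {
    @Id String vin;
    String color;
}

ofy().save().entity(new Car("123123", "red")).now();
Car c = ofy().load().type(Car.class).id("123123").now();
ofy().delete().entity(c);
</code></pre>
<p>Objectify makes it really easy to declare entities as simple classes and then takes care of all the mapping between those classes and the Datastore.</p>
<p>It also has a few tricks up its sleeve. For example, it comes with a first-level cache. This means that whenever you request an entity by key, it first checks a request-scoped cache to see whether the entity has already been fetched. This helps performance. However, it can also be confusing because when you fetch an entity and modify it but do <em>not</em> save it, the next read will yield the same cached, modified object. This can lead to Heisenbugs.</p>
<h4 id="developmenttesting">Development &amp; Testing</h4>
<p>Since the Datastore is a proprietary cloud database, you cannot just start it locally. When you run the app on your machine, the SDK starts a simulated Datastore. It behaves very close to the production environment. Only the performance is much better, which can be misleading.</p>
<p>For running tests against the Datastore, the SDK is also able to start a local Datastore for you. However, this must be a different implementation since it does not behave the same as the one used when running the app. You realize this when a missing multi-index throws an error when executing the app locally but not when testing the same query. Over the years, I accidentally shipped several queries with a missing index to production (usually still behind a Beta toggle) - although I had tests for them. After contacting support, they acknowledged the oversight and promised to fix it - more than a year later, they still have not.</p>
<h4 id="backups">Backups</h4>
<p>Making backups of the Datastore is an atrocious process. There is a manual and an automatic way. Of course, when you have a production application, you want regular backups. The official way is a feature introduced in 2012 which is still in Alpha!</p>
<p>By adding an entry to your <code>cron.xml</code> you can initiate the backup 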
process. The entry includes the names of the entity kinds to back up and the Google Cloud Storage bucket to save them to. When the time has come, it will launch a few Python instances with the backup code, iterate through the Datastore and save the entities in some kind of proprietary backup format to your bucket. Funnily enough, there is a limit on how many files a bucket can contain, so you had better switch to a new bucket from time to time.</p>
<p>This is the absolute worst thing about the Datastore.</p>
<h3 id="memcache">Memcache</h3>
<p>The other crucial way to store data on App Engine is Memcache. By default, you get a <em>shared</em> Memcache. This means it works on a best-effort basis with no guarantee of how much capacity it will have. There is also the dedicated Memcache for $0.06 per GB per hour.</p>
<p>Objectify is able to use this as a second-level cache. Just annotate an entity with <code>@Cache</code> and it will ask Memcache before the Datastore and save every entity there first. This can have a tremendous effect on performance. Usually Memcache responds within about 5ms, which is much faster than the Datastore. I am not aware of any stale cache issues we might have had. So this works very well in production.</p>
<p>The benefits of it are actually very noticeable when Memcache is down. This happened to us about once a year for an hour or two. Our site was barely usable, it was that slow.</p>
<h3 id="bigquery">BigQuery</h3>
<p><a href="https://cloud.google.com/bigquery/">BigQuery</a> is a data warehouse as a service, managed by Google. You import data - which can be petabytes - and can run analyses via a custom query language.</p>
<p>It integrates somewhat well with the Datastore since it allows you to import Datastore backup files from Google Cloud Storage. I have used this a few times, unfortunately not always successfully. For <em>some</em> of our entities I received a cryptic error. I was never able to figure out what was wrong. But some entities did work. After fiddling with the query language documentation a bit, I was able to generate my first insights. All things considered, it is a good way to run simple analyses. I certainly could not have done this without writing custom code otherwise. But I did not really harness the full potential of the service. All the queries I made could have been done in any SQL database directly; our data set was quite small. Only because of the way the Datastore worked did I have to resort to the BigQuery service in the first place.</p>
<h2 id="monitoring">Monitoring</h2>
<p>The Google Cloud Console brings a lot of features to diagnose your app's behavior in production. Just look at the Google Cloud Console navigation:</p>
<p><img src="http://www.djindien.com/content/images/2017/02/Screen-Shot-2017-02-27-at-19.58.46.png" alt="Google Cloud Console - Monitoring"></p> 
<p>This is the result of <a href="https://techcrunch.com/2014/05/07/google-acquires-cloud-monitoring-service-stackdriver/">Google's acquisition of Stackdriver</a> in 2014. It still feels like a separate, standalone service - but its integration into Google Cloud Console is improving.</p>
<p>Let's look at the capabilities one by one.</p>
<h3 id="logging">Logging</h3>
<p>It is crucial to access an application's logs quickly and with ease. This was really painful in the early days of App Engine. It used to be very cumbersome since it could not search across <em>all</em> versions of an application. This meant that when you were looking for something, you had to know which version was live at the time - or try several versions one by one. It was almost unusable. Plus it was extremely slow.</p>
<p>Since then, they have added useful filters to show only specific modules, versions, log levels, user agents or status codes. It is very powerful. Still not <em>fast</em>, but much better now compared to the early days. Here is how it looks:</p>
<p><img src="http://www.djindien.com/content/images/2017/02/Screen-Shot-2017-02-27-at-20.26.16.png" alt="Google Cloud Console - Logging"></p>
<p>One unique idea you can see here is that logs are always grouped by request. In all other tools I have come across, for example Kibana, you only get the log lines that match your search. By always showing all the other log lines of a request around the ones that match your search, it gives you much more context. I find this extremely useful when investigating an issue in the logs since it immediately helps you to better <em>understand</em> what happened. I truly miss that feature in every other log viewer I use.</p>
<p>Another interesting trait of App Engine is that each HTTP request is automatically assigned a request ID. It is added to the incoming HTTP request and uniquely identifies it. This comes in handy for correlating a request with its logs. For example, we send an email when an uncaught exception occurs and include the request ID - this makes looking up the logs trivial. The same can be done for frontend error tracking.</p>
<h3 id="metrics">Metrics</h3>
<p>The Cloud Console gives access to a few basic application metrics. This includes request volume and latency, traffic, memory usage, number of instances and error count. It is useful as a starting point when investigating an issue and when you want to get a quick first impression of the general state of the app.</p>
<p>Here is an example with the app's request volume:</p>
<p><img src="http://www.djindien.com/content/images/2017/01/Screen-Shot-2017-01-22-at-15.44.06.png" alt="Google Cloud Console - Graphs"></p>
<h3 id="tracing">Tracing</h3>
<p>Since the App Engine instance is a black box, you cannot use other tools to 
diagnose its performance. If the logging console is not enough, the <em>Trace</em> page provides more detailed data. It allows you to search for the latency distribution of certain requests.</p>
<p><img src="http://www.djindien.com/content/images/2017/02/Screen-Shot-2017-02-20-at-22.31.44.png" alt="Google Cloud Console - Trace"></p>
<p>When you select a specific request, it opens up a timeline. It shows you the remote procedure calls (RPCs) that you do not see in the logs. In addition, there is a summary for each RPC type. By clicking on an RPC, more details, e.g. the response size, are shown.</p>
<p>This can be extremely helpful to find the cause of a slow request. In the following example you can see that the request makes a few fast Memcache calls and a very slow Datastore write operation.</p>
<p><img src="http://www.djindien.com/content/images/2017/02/Screen-Shot-2017-02-20-at-22.32.49.png" alt="Google Cloud Console - Analysis"></p>
<p>The only problem is that the RPCs do not include enough information to figure out what happened exactly. For instance, the detail view of the Datastore write operation looks like this:</p>
<p><img src="http://www.djindien.com/content/images/2017/02/Screen-Shot-2017-02-21-at-08.14.16.png" alt="Google Cloud Console - Analysis Detail View"></p>
<p>It does not even include the name of the updated entity. This is a huge annoyance and can render the whole screen almost useless. Only one thing can help: clicking on the 'Show logs' button in the top right corner. It will then include the log statements of the request <em>inline</em>, interleaved with the RPCs. This way you <em>might</em> be able to infer more details from the context.</p>
<h2 id="resources">Resources</h2>
<p>It is also important to point out that pricing is completely usage-based. This means your app is billed almost per byte, per hour and per operation. It also means that getting started is very affordable. There are no fixed costs. If hardly anyone uses your app - since there is a free quota - you do not pay anything.</p>
<p>The biggest item on the bill will most certainly be for the instances, contributing about 80% in my last project. The next big chunk is likely the Datastore read/write cost, 15% of the total cost for us.</p>
<p>There is a nice interface in the Google Cloud Console to keep track of all quotas:</p>
<p><img src="http://www.djindien.com/content/images/2017/03/Screen-Shot-2017-03-13-at-11.33.20.png" alt="Google Cloud 
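Console - Quotas"></p>
<p>To be more specific, when I say 'all quotas' I mean all quotas Google tells you about. In fact, we ran into a problem with an <em>invisible</em> quota. Although I <em>think</em> the API in question may still have been in Beta at the time. Anyway, a part of our application stopped working and we had no idea why. Luckily, we had subscribed to <a href="https://cloud.google.com/support/">Google Cloud Support</a>. They informed us about said quota and we had to rewrite a part of our application to make it work again.</p>
<p>We also had one minor outage due to the confusing pricing setup. One time, one of our applications suddenly stopped working and only responded with the default error page. It took us ten minutes to figure out that we had hit the budget limit we had set. After we raised it, everything just started working again.</p>
<h2 id="support">Support</h2>
<p>There is a lot to be said about Google Cloud Support. First of all, without it we would have been in serious trouble on occasion. So having it is a necessity for any mission-critical app - in my opinion. For example, about once a year our application would stop serving requests. We had not done anything to cause this. After contacting Google support, we would learn that they had moved our application to a 'different cluster'. It just worked again. This is a pretty scary situation. You cannot do anything but 'pray to the Google gods'.</p>
<p>Second of all, it is hit or miss depending on the support person. The quality varies a lot. Sometimes we needed to exchange a dozen messages until they finally understood us. Like any support, it can be infuriating. But in the end, they would usually resolve our issue or at least give us enough information to help us resolve it ourselves.</p>
<h2 id="anewage">A New Age</h2>
<p>Google is working on a new type of App Engine, the <a href="https://cloud.google.com/appengine/docs/flexible/">flexible environment</a>. It is currently in beta. Its goal is to offer the best of both worlds: the ease and comfort of running on App Engine and the flexibility and power of Google Compute Engine. It allows you to use any programming platform (like Java 9!) on any of the powerful Google Compute Engine machines (like 416GB RAM!) 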
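while letting Google take care of maintaining the servers and ensuring the app is running fine.</p>
<p>They have been working on this for some years already. Naturally, we were keen to try it out. So far, <a href="https://tech.small-improvements.com/2016/09/12/running-our-app-engine-application-in-the-flexible-environment-java-8/">we weren't that thrilled</a>. But let's see where Google is taking this.</p>
<h2 id="designforscale">Design for Scale</h2>
<p>Now, you can look at the restrictions App Engine imposes on your app as annoyances. But bear with me for a moment. App Engine was created by Google. These people know how to build scalable systems. The restrictions are simply a necessity. They force you to adapt your app to the ways of the cloud. This is a good thing and should be embraced. If you feel like you are fighting App Engine, you are fighting the 'new' rules of the cloud. This is certainly one lesson I'm taking away from three years on Google App Engine.</p>
<p>Some restrictions and annoyances are the result of neglect by Google, though. It feels like they invest only the bare minimum. In fact, I have had this feeling for the past two years. It is frustrating to work with an aging technology stack without any hope of improvement. It is infuriating when there are known issues but they are not fixed. It is depressing to receive so little information about where the platform is heading. You feel trapped.</p>
<p>All in all, I liked how App Engine allowed the development team to focus on actually building an application, making users happy and earning money. Google took a lot of the hassle out of operations work. However, the 'old' App Engine is on its way out. I do not think starting a new project on it is a good idea anymore. On the other hand, if the App Engine Flexible Environment can actually fix the major problems of its predecessor, it might become a very interesting platform to develop apps on.</p>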