Unleashing Creativity: The Industrial Revolution Behind Cloud Gaming

Nov 13, 2020

INTRODUCTION: From OnLive to Stadia, the Myths of Cloud Gaming

At GDC 2009, OnLive came out of nowhere to amaze the crowd: a new era of gaming experiences had arrived. With plenty of spotlight, capital and over 100,000 applications, OnLive earned the industry’s attention. It even launched its own game box and controller, much like what Steam offers now.

Initial investment in OnLive came from Warner Bros, Autodesk and Maverick Capital; later investors included AT&T Media Holdings, Inc., Belgacom and others. But the service lasted only five years before it was discontinued and its assets were acquired by Sony, which runs the cloud service now known as PlayStation Now.

After OnLive became history, successors in the cloud gaming space began to emerge. There are dozens of consumer-facing (to-C) cloud gaming platforms worldwide, run by start-ups such as Blade Shadow, Jump, Rainway and Vortex; console and PC platform providers such as Microsoft, Sony and Valve; cloud computing vendors such as Google and Amazon; game makers such as EA and Tencent; hardware manufacturers such as Nvidia…

The biggest recent development is Stadia, which went live in November 2019. Google has taken advantage of its cloud computing to create this web-only service platform. Gamers can use Google Chrome to instantly experience any game wherever there’s an internet connection, whether on a phone, tablet or computer, and can even click to join a game while watching a YouTube video.

According to Sensor Tower, the Stadia app garnered 175,000 downloads in the month it was released, and by April 2020 it had surpassed one million downloads worldwide. But whether Stadia will work out the way Google envisioned remains an open question, and it comes with questions that every cloud gaming platform needs to answer about the delivery experience, game content, billing models, and so on.

Cloud gaming spans quite a few industries, and each has a variety of participants. Drawing on their own strengths and their own understanding of cloud gaming, they will define its shape and explore different paths to implement it.

In fact, there are a number of issues that almost always arise in the development and exploration of cloud games:

What should a cloud game look like?
How are cloud games different from current games?
Why is a “streaming” cloud gaming platform not a real thing?
What is the logic and core motivation behind the development of cloud gaming?
Why will the emergence of cloud gaming revolutionize the industry?
…

To answer these questions, this paper begins with the development of cloud gaming, identifies the core logic behind the technology-driven birth of new demand, and, based on this, discusses the different implications acting on two levels:

Data transmission methods: support faster and more stable digital content interaction.
Content production process: developing richer and more diverse content at lower cost and in less time.

We also discuss the structural features of the cloud game in the context of cloud technology and emphasize the importance of computing power in each part and how to achieve the cycle of cloud development, cloud testing, cloud operations, and cloud updates.

I. The development logic of the cloud game: technology drives the birth of new demand

1. Three elements of game content interaction (3V)

Broadly speaking, when players experience game content, there are three levels of elements that affect their interaction experience: Velocity, Volume, and Variety.

Velocity (speed) of content interaction refers to the delay between the moment the player acts and the moment the player perceives the game’s feedback. The faster the interaction, the smoother the player’s ongoing gaming experience. For locally running games, this can be understood as the data throughput supported by the terminal hardware and its software system; for networked games, it also includes the data exchange rate between the local terminal and the cloud server.
Volume of content interaction refers to the amount of data and information in the game that the player can interact with. To be clear, a game having a lot of content does not mean the player can interact with a lot of content. It is the amount of interactable content that affects the player’s ongoing experience.
Variety (diversity) of content interaction refers to how varied the interactions between the player and the game content are. Volume and diversity describe two different dimensions of content: volume relates more to its overall size, while diversity relates more to how changeable it is.
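To make the velocity element concrete, here is a minimal sketch that adds up the delays between a player’s action and the visible feedback, once for a locally running game and once for a networked one. The stage names and all millisecond values are illustrative assumptions, not measurements from any platform.

```python
from dataclasses import dataclass


@dataclass
class LatencyBudget:
    """Per-stage delays in milliseconds (all values are illustrative assumptions)."""
    input_ms: float                # controller/touch sampling on the terminal
    compute_ms: float              # game logic and rendering for one frame
    display_ms: float              # presenting the frame on the screen
    network_rtt_ms: float = 0.0    # round trip to a remote server (0 for local play)
    codec_ms: float = 0.0          # video encode plus decode (0 for local play)

    def total_ms(self) -> float:
        """Delay between the player's action and the perceived feedback."""
        return (self.input_ms + self.compute_ms + self.display_ms
                + self.network_rtt_ms + self.codec_ms)


# Hypothetical numbers: the same game run locally and run from the cloud.
local = LatencyBudget(input_ms=8, compute_ms=16, display_ms=8)
streamed = LatencyBudget(input_ms=8, compute_ms=16, display_ms=8,
                         network_rtt_ms=30, codec_ms=10)

extra_delay = streamed.total_ms() - local.total_ms()
```

In this toy budget the networked path differs from the local one only by the network round trip and the codec time, which is why those two stages are where transmission improvements matter.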

For gamers, a smoother experience with game content requires not only optimization at the software and algorithm levels, but also hardware that speeds up information processing and better network systems that reduce latency.

At the same time, the volume of content in the game itself will directly affect the length of the player’s experience. In addition to the quality of the graphics, the variety of game content will affect the utility of the player in the entire experience (immersion, satisfaction, achievement, etc.).

Following this logic, players may make payments in three categories:

Speed: hardware, software, communication data, etc.
Volume: duration (play time), game copies (AAA titles cost more), DLC, battle passes, etc.
Diversity: in-app purchases (skins, props, etc.).

As communications technology becomes the underlying infrastructure of digital networks, the way players pay is increasingly tied to the content itself. Whether through volume or diversity, if a game can increase the amount and quality of its content, players will pay more for it.

2. Technology-driven vs. business-model-driven

For anything that can be called a video game, no matter what its content looks like, the fundamental requirement is to deliver that content to players quickly and stably, and to react promptly to the player’s actions. This gives us an important line of reasoning about communication and data transfer:

The development of cloud gaming came first from technological breakthroughs in the electronics & communications industry, which provided technical support for product engagement on the demand side. Two levels of problems were solved, one being server computing processing efficiency and the other being network transmission speed.

(1) At the server computing level, in terms of computing processing power and operating costs:

Servers that are already built can support more computation.
Future new servers will be able to achieve higher computing performance at lower costs.

(2) At the network transmission level, the progress lies mainly in video encoding and decoding standards and in network transport protocol standards:

Video encoding and decoding standards can be understood as governing the transmission of multiple types of data over multiple channels. 4G led to video applications such as TikTok and Kwai and to the live streaming industry, but these applications are still generally limited to mobile. 5G applications that rely on the new infrastructure use more compact video encoding over the same transmission channel to meet the demand for higher-definition real-time interaction, with a corresponding increase in communication capacity.
Network transport protocol standards, in turn, build on encoding and decoding to make data exchange more efficient in both speed and accuracy.

To achieve these effects, we first need technological breakthroughs in the upstream and midstream of the industry: sensors and chips in the semiconductor industry (where the traditional Moore’s Law is gradually losing validity, shifting attention to the dividends of a quantum version of Moore’s Law); network frameworks in the communications industry (e.g. SDN/NFV for 5G, which essentially still innovates on the collaboration between hardware and software and is becoming progressively fragmented); and transmission efficiency (frequency band selection, large-scale antennas, exploration of the physical and chemical properties of the conduction medium), etc.

At the same time, another logic for the development of cloud games is that user demand is stimulated at the downstream end, which attracts capital investment and, in turn, drives upstream and midstream development.

For example, for many enterprise services:

In the early stages of an industry, business forms and digital forms were immature, and at first there were no integrated, purpose-built applications. People had more freedom to adapt to different business forms, but they had to configure their own networks, servers, operating systems and storage; local deployment was less convenient than cloud deployment, and resources often sat idle. This led to the emergence of IaaS cloud services.
As business and markets developed, corresponding business forms gradually took shape. Across different scenarios, people had to handle growing amounts of data, demand for storage and interaction efficiency increased, demand for personalized applications rose, and the costs of self-built facilities and idle capacity climbed, which gave rise to PaaS.
Later, as industries matured, many similar business scenarios emerged. To further improve utilization and responsiveness and to reduce the cost of duplication and idle capacity, products and applications appeared in the form of SaaS.

If you look at cloud gaming from a demand perspective, the fact is that user demand for “true cloud gaming” has yet to emerge. In other words, there is not yet a game that can only be played on the cloud: a game whose experience local servers and terminals cannot deliver, and whose differentiated experience or effect only cloud computing and cloud servers can achieve. Only once such downstream demand exists are capital and resources pooled to increase productivity in the midstream and upstream and meet that demand at scale.

In fact, we think the focus of cloud gaming should be on “games,” and we define “cloud” as follows: every aspect of the game, from planning (mechanics, gameplay, settings) through development and testing to operations such as monetization, is reconsidered and implemented around the characteristics of the cloud, taking full advantage of the cloud while trying to avoid the disadvantages of traditional processes.

That is, based on the characteristics of the cloud we can:

Optimize and transform production processes in the existing chain to create richer game content faster.
Rapidly iterate on different content based on user preferences revealed in the interaction between users and cloud games.

This is really about the volume and diversity of content in the “interaction triad”.

Regarding the volume of content: in the existing development process, all kinds of games face the same problem. If quality is to be maintained, the speed of content production cannot keep up with the speed of consumption. For example, the AAA game Red Dead Redemption 2 took 8 years to develop with a team of more than 1,000 people and cost more than 5 billion RMB to make, so producing more DLC and other content would require still more time and money. This is especially true for online games, mobile games, etc.: what a team spends 2–3 months producing for an update, a player might finish in 2–3 days.
Regarding the variety of content: a game whose content is merely voluminous does not sustainably retain players. For example, many open-world games have a lot of content, but the NPCs, quests, item collection, etc. are so repetitive that players cannot consistently find interesting content or even feel motivated to explore more of the game. Of course, complete diversity may not be achievable, and the complete absence of diversity is the other extreme. Both users and developers want variety in their games, but most games make trade-offs due to technical, human resource, and financial constraints.

For both of these segments, the cloud can demonstrate very strong computing power to support the mass production of diverse content through high-speed data transfer and the scalability and elasticity of cloud computing. But having the computing power to support it is still not enough, and other ways to use that computing power to automate the production of content are needed, and that’s the part where AI excels.

Whether it’s at the graphical or logical level, bringing AI into the traditional production process allows for faster creation of richer content, and the introduction of cloud technology not only amplifies the capabilities of AI but also moves the production process to the cloud, thus giving the game the ability to iterate in real-time during the operational phase.

In this sense, creating a “true cloud game” circles back to being technology-driven.

By technology-driven, we are not referring to the current “streaming” approach, but rather: by optimizing and transforming the production process, we can create games with rich and constantly updated content in a faster and more efficient way, generating different game contents for different players in real-time.

Based on the discussion above, such a product will stimulate new user needs, gather new traffic, and gradually create new channels. At the same time, cloud-native games developed in the cloud will also interact with the cloud in real-time during the operation; based on players and traffic, the specific delivery, operation, and monetization of the game will change accordingly.

3. New interactions, new scenarios, new opportunities.

For current game content, the efficiency of production limits the efficiency of product updates and iterations, and content is further constrained by its channels and their user groups. Channels fall into two categories: terminal hardware, represented by PCs, mobile phones, consoles, etc., and terminal software, represented by the Steam Store, Epic Store, TapTap, and various app stores.

For terminal hardware, as cloud technology develops, the gaming experience is gradually freed from hardware restrictions, so that interaction can take place anywhere there is a “screen”. In other words, with the support of the Internet of Things and material technologies, the hardware channel of the future will not be just an electronic device in the traditional sense, but any object in our surroundings.

As for terminal software, game platforms (Steam, Epic, etc.), app stores (Apple Store, Google Play Store, Myapp, etc.), advertising platforms (Google Ads, Tencent Ads, Pangleglobal, etc.) and various user pools (communities, streaming media, etc.) form a combination of “outside the game” monetization.

Because a cloud game is cloud-native, everything from development to live operation is completed in the cloud. Based on the different modules users interact with, “in-game” user tags can be acquired in a structured way, and different game content can then be generated in real time according to each user’s interactions.

The game content of different cloud game platforms will attract different types of users, and when users interact with the game, they will generate a lot of structured scenes, on the basis of which a corresponding recommendation mechanism can be introduced to connect the “characters — items — scenes” within the game to form a new consumption logic.

In addition, the combination of communication technology, cloud technology and AI technology will redefine the boundary of “game”. At the same time, the interactive experience of cloud games, when combined with other scenarios, will meet new needs and create new value. For example, the native combination of cloud games and live streaming will create new scenarios in which users interact with streamers and game content:

Google Stadia’s Crowd Choice allows viewers to vote on the progress of a video game streamer during a game. Specific story points that can be voted on can be built into the game by the designer and used at the streamer’s discretion. For example, what weapons and equipment to use, how to kill monsters, how to decide the conversation with NPCs, and so on. Reinforcement learning-based AI and cloud technology will truly unlock the possibilities of in-game interactions to satisfy deeper interactions between players, streamers and games.
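The voting flow described above can be sketched as follows. This is a hypothetical illustration only, not Stadia’s API: the `StoryPoll` class, its methods and the sample options are all assumptions. The designer defines the options, viewers vote, and the streamer decides whether to apply the result.

```python
from collections import Counter
from typing import Dict, List, Optional


class StoryPoll:
    """A designer-defined decision point that viewers can vote on (hypothetical)."""

    def __init__(self, question: str, options: List[str]) -> None:
        self.question = question
        self.options = list(options)
        self._votes: Dict[str, str] = {}  # viewer_id -> chosen option

    def vote(self, viewer_id: str, option: str) -> bool:
        """Record one vote per viewer; a later vote replaces the earlier one."""
        if option not in self.options:
            return False
        self._votes[viewer_id] = option
        return True

    def tally(self) -> Counter:
        """Count the current votes per option."""
        return Counter(self._votes.values())

    def winner(self) -> Optional[str]:
        """Most-voted option; ties go to the option the designer listed first."""
        counts = self.tally()
        if not counts:
            return None
        return max(self.options, key=lambda option: counts[option])


poll = StoryPoll("Which weapon should the streamer use?", ["sword", "bow", "staff"])
poll.vote("viewer-1", "bow")
poll.vote("viewer-2", "sword")
poll.vote("viewer-3", "bow")
rejected = poll.vote("viewer-4", "laser")  # not an option the designer built in

# The streamer keeps discretion over whether the crowd's choice is applied.
streamer_accepts = True
chosen = poll.winner() if streamer_accepts else None
```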

In addition, Genvid viewers can interact with the live feed they’re watching, as well as customize the angle of the in-game camera to watch the game live the way they like, or to put it in a different way, experience the game from a different angle.

At the same time, because all the graphics of the cloud game are generated and rendered in the cloud, the image and actions of specific characters can also be changed and controlled in real time. According to the user’s tag, we can recommend his or her favorite celebrity or image, and interact with the player to form a new consumption scene, etc.

For the cloud game itself, when there is a breakthrough in communication technology, cloud technology and AI technology, as its content is deeply supported and engaged by technology, the traditional consumption scenario and logic will change, thus bringing new structural opportunities.

II. The structural characteristics of the cloud game: modularization and integration

1. Computational layer: centralization of computational power

The real-time generation of game content relies on computing power, and huge game content requires the computing power of a server farm to complete. By putting computing power in the cloud, we can support the real-time generation of high-volume content with cloud computing power clusters.

The real-time generation of logical content involves decision generation, dialogue generation, etc. It requires computational power for pre-training and for iterating in real time as it is used. The real-time generation of image content, on the other hand, involves graphics generation, rendering, etc. This part covers visual effects generation and visual logic generation.

In visual effects generation, graphics are produced by setting up a 3D model or 2D graphics during game development, and then applying textures, materials, etc. that are prepared in advance to obtain the base image for a given frame. When the object moves, two levels of effects build on this base image: the physical effect and the animated effect. The physical effect determines the rules for the movement of the image, and the animated effect presents that movement. Visual logic generation, on the other hand, covers the logic for generating graphics, maps, and particles.

In addition to the above two, there is also real-time generation of sound content, including dialogue sound generation, scene sound generation, etc.; and real-time generation of kinesthetic content, involving tactile generation, etc.

Storing and computing data on the local client consumes CPU and GPU computing power, while data synchronization consumes only a fraction of that. The current way for users to increase local computing power is to buy and update hardware. At the same time, due to the difference in hardware, games are divided into console games, mobile games, PC games, etc., and thus the user base is also split.

The content of the current game and the hardware are highly coupled, and the presentation of the content is limited by the computing power of the hardware. In other words, the effect of the content that a single terminal’s hardware can show is limited. Whether it is the image level, the logic level or the gameplay level, all will be greatly limited.

For now, the technology can solve the problems of network transmission speed and cloud computing power, meaning: if there are games based on cloud computing power, they can have more sophisticated graphics as well as more complex gameplay, game elements and game systems.

On the graphics side, there are breakthroughs from NVIDIA, AMD, and others, but on the logic side, cloud computing can be used to create a game with more complex “gameplay, game elements, and game systems” that can generate different game content in real time based on different user interactions during the operational phase. Such a game will surely become a milestone in the history of cloud gaming.

Therefore, the problem goes back to the game itself, that is, how to develop games with “more complex game play, game elements and game system”, gather a large number of game players, so as to attract more developers to use such technology development, and further enrich the content of the cloud game and industrial ecology.

2. Development layer: modular development

To be clear, the current game engine is an “image oriented” engine. Whether it’s modeling, animation or special effects, the first purpose is to allow developers & artists to let players directly feel the game effects through images; at the same time, it takes into account the features of a development tool, such as network transmission, data synchronization and other systems; finally, it involves the implementation of game mechanics and gameplay, which can be achieved by coding.

Neither Unity nor Unreal Engine seems to give much thought to game mechanics and gameplay; they only provide ways to write code that implements things such as action skills, map generation and game UI, which indirectly achieve the effects the game planners want.

The methods used to create images and visual effects do not really need to be modularized, because we cannot say that there is a box module or a sun module. Almost all of them are processed by modeling, rendering and special effects, so they cannot be modularized. We can only use the most basic rules to achieve this, such as: providing triangle modeling and calculation, providing physical collision and calculation, providing light reflection and calculation, and so on.

Image: Unreal Engine 5 next-gen real-time demo running on PlayStation 5

But from the perspective of game planning, a modular development method is very much needed to express the desired “game mechanics, gameplay, and game elements” more intuitively. In other words, what is needed is a “logic-oriented” engine or cloud service. For the specific mechanisms, gameplay and elements that can be modularized, you can refer to our previous research on game rules, mechanisms and gameplay; we won’t discuss them in depth here.

So why can these modules be developed in the cloud? In fact, from the results, these modules can be run using the local server, but the question is, how to achieve the effects of these modules? For example, in order to realize a narrative-oriented intelligent NPC and advance the game plot with the player, the role setting must be done during game development, and then trained in the cloud through AI. This in itself is a process that requires computing power. At the same time, intelligently generating tasks based on the interaction between the player and the game content also requires the assistance of AI. Otherwise, due to human and resource constraints, the richness and diversity of content interaction cannot be achieved.

There are many such examples. Whether it is an NPC’s decision-making responses, the task system, or the stage system, if traditional production processes and human teams are used, the input cost increases almost exponentially. Given the uncertainty of post-launch earnings, it is impossible to invest in production without limit. Updating and iterating content promptly (or even in real time) after launch also requires a lot of manpower.

Players’ demand for content is unlimited. Meeting it through traditional production would mean increasing development cost without limit, and this is clearly not a problem that economies of scale can solve.

Beyond the visuals, the player’s experience of game content comes from the mechanics, gameplay, plot and similar layers. Visual content attracts players at the beginning, but its pull gradually fades, while the appeal of mechanics, gameplay and plot gradually rises.

In general, gameplay based on battle requires new visual effects to continue to attract players; gameplay based on narrative requires new plots to continue to attract players; gameplay based on stages requires new stages to continue to attract players…

At the same time, compared with visual effects, content at the level of mechanics, gameplay and plot holds players’ attention longer; but whether in initial development or in updates during the operation stage, it costs a lot of manpower. In addition, once the overall visual style of the game is roughly settled, most updates and iterations in the operation phase will fall on this kind of content.

Therefore, if you want to continue to attract new players and retain old players through this content, there are two problems:

Players consume content too quickly
Players do not like the corresponding content

The first problem is essentially one of production efficiency. In theory, real-time updates could be achieved with an unlimited number of people, but in practice this is impossible. There is therefore an efficiency boundary: beyond it, the marginal benefit is offset by cost and eventually becomes negative. The second problem is essentially the match between content and players. Solving this problem requires three things to work simultaneously:

Unlimited content output
Adjust the content in time according to player feedback
Provide different content for different players

Obviously, the traditional development process cannot satisfy these three points at the same time. Introducing AI can address all three, and once AI is used, cloud computing power becomes the inevitable choice for improving production efficiency.

Therefore, the purpose of modular development is to improve both the production efficiency of content at the level of mechanics, gameplay and plot, and the efficiency of matching content to players. Achieving this requires the assistance of AI and the support of cloud computing power.

3. Development layer: integrated control

In order for different modules to better coordinate and cooperate with each other, it is necessary to have one or more centers for centralized control, so as to avoid directly controlling all “production systems” through the underlying rules.

For example, special effects software takes two approaches. One is node-based, as in Fusion and Nuke: the control process is very intuitive and the node tree is clear at a glance, so users can easily find, modify and classify each material and combination and make adjustments with confidence. The other is layer-based, as in After Effects: all effects are divided into front and back layers for compositing. In practice, however, the various materials are correlated or influence one another, which makes it very troublesome to adjust or add a particular effect.

When using the 3D software Houdini, its node tree allows designers to replace nodes anywhere without changing the overall effect. For example, it allows the replacement of models to ensure that the effect is still applied to the corresponding model. For this, 3ds Max and Maya are not as effective as Houdini.

The point of this example is that integrated, node-based operation makes the relationships between modules very clear, and at the same time lets individual modules be adjusted freely without affecting the normal operation of the others.
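The node-based idea can be illustrated with a tiny dataflow graph in which one node is replaced while the downstream effect keeps working. This is a minimal sketch under stated assumptions: `NodeGraph` and its methods are hypothetical names for illustration, not the API of Houdini, Fusion or Nuke.

```python
from typing import Callable, Dict, List, Sequence


class NodeGraph:
    """Tiny dataflow graph: each node is a function of its upstream nodes' outputs."""

    def __init__(self) -> None:
        self._funcs: Dict[str, Callable[..., object]] = {}
        self._inputs: Dict[str, List[str]] = {}

    def set_node(self, name: str, func: Callable[..., object],
                 inputs: Sequence[str] = ()) -> None:
        """Add a node, or replace an existing one without touching the others."""
        self._funcs[name] = func
        self._inputs[name] = list(inputs)

    def evaluate(self, name: str) -> object:
        """Recursively evaluate a node from its inputs (no caching, no cycle check)."""
        args = [self.evaluate(dep) for dep in self._inputs[name]]
        return self._funcs[name](*args)


graph = NodeGraph()
graph.set_node("model", lambda: "crate")
graph.set_node("material", lambda: "wood")
graph.set_node("effect",
               lambda model, material: f"burning {material} {model}",
               inputs=["model", "material"])

before = graph.evaluate("effect")

# Swap only the model node; the material and effect nodes are left untouched.
graph.set_node("model", lambda: "barrel")
after = graph.evaluate("effect")
```

Because the effect node refers to its inputs only by name, replacing the model changes what the effect is applied to without any edit to the effect itself.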

Only with this kind of control can game content be adjusted in real time according to each player’s interaction during the operation phase, so that every player gets a personalized version of the game and players stay engaged for longer.

4. Operation layer: real-time updates

We have covered much of this above, so we will not repeat it here. Simply put, there are two levels:

After the game is launched, based on the modular content representation, the content of a single or multiple modules can be automatically produced
Automatically update and iterate different content modules according to the interaction between players and the game, and improve the attraction efficiency of different players

Every process needs the support of cloud computing power.
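As a rough illustration of the second level, the sketch below records how different player segments respond to different variants of one content module and picks the variant to serve next. The class name, segment labels, variant names and engagement scores are all hypothetical; a real system would also need to keep exploring new variants rather than always serving the current best.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


class ModuleTuner:
    """Tracks how each player segment responds to each variant of a content module."""

    def __init__(self, variants: List[str]) -> None:
        self.variants = list(variants)
        # (segment, variant) -> [sum of engagement scores, number of observations]
        self._stats: Dict[Tuple[str, str], List[float]] = defaultdict(lambda: [0.0, 0])

    def record(self, segment: str, variant: str, engagement: float) -> None:
        """Store one observed interaction (engagement is an assumed 0..1 score)."""
        stat = self._stats[(segment, variant)]
        stat[0] += engagement
        stat[1] += 1

    def mean(self, segment: str, variant: str) -> float:
        """Average engagement so far, or 0.0 if the pair was never observed."""
        total, count = self._stats.get((segment, variant), (0.0, 0))
        return total / count if count else 0.0

    def best_variant(self, segment: str) -> str:
        """Variant with the highest mean engagement for this segment so far."""
        return max(self.variants, key=lambda variant: self.mean(segment, variant))


tuner = ModuleTuner(["combat_quest", "story_quest"])
tuner.record("explorer", "story_quest", 0.9)
tuner.record("explorer", "combat_quest", 0.4)
tuner.record("fighter", "combat_quest", 0.8)
tuner.record("fighter", "story_quest", 0.3)

next_for_explorer = tuner.best_variant("explorer")
next_for_fighter = tuner.best_variant("fighter")
```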

In summary, from the perspective of game development and operation, the true characteristics of cloud games are cloud development, cloud testing, cloud operations, and cloud updates. These four points also form a cycle that runs through both original development and iterative development. Coupled with data feedback from intelligent delivery at the channel level, the whole cycle of intelligent content generation and intelligent player matching will improve overall commercialization efficiency, turn that efficiency into accumulated technology, and continue to promote the development and advancement of society.

III. From cloud gaming to virtual world: computing power releases creativity

In fact, it is not only cloud games whose development requires a large amount of computing power. Other fields also need to combine computing power with related technologies to further increase the amount of digital information, so that people can quickly and efficiently create and experience virtual worlds in various scenarios.

With the development of information technology, we continue to use sensors and information coding technology to digitize real-world information through image, audio, video and other information carriers, and we also begin to create native information in the virtual world.

With the assistance of big data, cloud, 5G, AI and other technologies, our real world is becoming increasingly connected with the virtual world. In this process, information transmission channels and corresponding tools are needed to improve the efficiency of information production in the virtual world. Taking into account the interaction methods of different scenarios, any method that can increase the efficiency of information transmission will generate value.

New objects, new scenes and new applications allow us to connect and perceive this information in new ways, which is accompanied by more and more diversified computing scenarios. In addition, users’ pursuit of application experience continues to increase, and the required computing power is increasing, which places new requirements on computing hardware.

Practice has shown that computing in only one or two of the three dimensions of cloud, edge, and device cannot fully meet user needs. Only integrated collaborative computing can meet the diverse needs of different users for latency, performance, and power consumption; at the same time, the various chips and hardware in different scenarios will also provide as much computing power as possible in a collaborative manner, further improving the application of computing power in those scenarios.

According to the virtual world development logic we discussed earlier, the development of technology can enable AI to help people realize the digitalization of creativity in a more efficient way in limited areas and conditions, thereby improving the efficiency of the entire industry.

Our development focus will gradually shift from “digital processing of the real world” to “creating a native virtual world” and to realizing an “intelligent agent” capable of autonomous decision-making, thereby creating more and more information in the native virtual world.

With the improvement of hardware computing power, cloud computing power and algorithm optimization, we believe that AI will perform better and better, and show amazing creativity in more fields. At the same time, we are also more convinced that the development of technology will further enhance the development and utilization of computing power, so that humans and AI will collaborate in a more in-depth manner and jointly release unprecedented creativity in the virtual world.

About rct

rct was founded in 2018, is a member of Y Combinator W19, and brings together talent across AI, design and business. The team is passionate about using AI to create next generation interactive entertainment experiences. Our mission is to help human beings know more about themselves. So far, rct is backed by YC, Sky Saga Capital, and Makers Fund.

See our official website: https://rct-studio.com/en-us/