
Unreal Engine 4 Optimization Tutorial, Part 2


This is part 2 of a tutorial to help developers improve the performance of their games in Unreal Engine* 4 (UE4). In this tutorial, we go over a collection of tools to use within and outside of the engine, as well as some best practices for the editor and scripting, to help increase the frame rate and stability of a project.

Editor Optimizations

Forward Versus Deferred Rendering

Deferred is the standard rendering method used by UE4. While it typically looks the best, there are important performance implications to understand, especially for VR games and lower end hardware. Switching to Forward Rendering may be beneficial in these cases.

For more detail on the effect of using Forward Rendering, see the Epic documentation.

If we look at the Reflection scene from Epic’s Marketplace, we can see some of the visual differences between Deferred and Forward Rendering.


Figure 13: Reflection scene with Deferred Rendering


Figure 14:  Reflection scene with Forward Rendering

While Forward Rendering comes with a loss of visual fidelity from reflections, lighting, and shadows, the rest of the scene remains visually unchanged, and the performance increase may be worth the trade-off.

If we look at a frame capture of the scene using the Deferred Rendering in the Intel GPA Frame Analyzer tool, we see that the scene is running at 103.6 ms (9 fps) with a large duration of time being taken by lighting and reflections.


Figure 15:  Capture of the Reflection scene using Deferred Rendering on Intel® HD Graphics 530

When we look at the Forward Rendering capture, we see that the scene's frame time has improved from 103.6 ms to 44.0 ms, roughly a 2.4x speedup, with most of the time taken up by the base pass and post-processing, both of which can be optimized further.


Figure 16: Capture of the Reflection scene using Forward Rendering on Intel® HD Graphics 530

Level of Detail

Static Meshes within UE4 can have thousands, even hundreds of thousands, of triangles to show all the smallest details a 3D artist could want to put into their work. However, when a player is far away from that model they won't see any of that detail, even though the engine is still rendering all those triangles. To solve this problem and optimize our game, we can use Levels of Detail (LOD) to keep that detail up close while showing a less expensive model at a distance.

LOD Generation

In a standard pipeline, LODs are created by the 3D modeler during the creation of that model. While this method allows for the most control over the final appearance, UE4 now includes a great tool for generating LODs.

Auto LOD Generation

To auto generate Static Mesh LODs, go into that model’s details tab. On the LOD Settings panel select the Number of LODs you would like to have.


Figure 17: Creating auto generated level of details.

Clicking Apply Changes signals the engine to generate the LODs and number them, with LOD0 as the original model. In the example below, we see that generating five LODs takes our Static Mesh from 568 triangles down to 28, a huge saving for the GPU.


Figure 18: Triangle and vertex count, and the screen size setting for each level of detail.

When we place our LOD mesh in the scene, we can see the mesh change as it gets further away from the camera.


Figure 19: Visual demonstration of level of detail based on screen size.

LOD Materials

Another feature of LODs is that each one can have its own material, allowing us to further reduce the cost of our Static Mesh.


Figure 20: Material instances applied to each level of detail.

For example, the use of normal maps has become standard in the industry. However, in VR there is a problem; normal maps aren’t ideal up close as the player can see that it’s just a flat surface.

A way to solve this issue is with LODs. By having the LOD0 Static Mesh detailed to the point where bolts and screws are modeled on, the player gets a more immersive experience when examining it up close. Because all the details are modeled on, the cost of applying a normal map can be avoided on this level. When the player is further away from the mesh and it switches LODs, a normal map can then be swapped in while also reducing the detail on the model. As the player gets even further away and the mesh gets smaller, the normal map can again be removed, as it becomes too small to see.

Instanced Static Meshes

Everything brought into a scene adds draw calls to the graphics hardware, and for a static mesh in a level this applies to every copy of that mesh. If the same static mesh is repeated several times in a level, one way to optimize is to instance those static meshes to reduce the number of draw calls made.

For example, here we have two spheres of 200 octahedron meshes; one set in green, and the other in blue.


Figure 21: Sphere of static and instanced static meshes.

The green set of meshes are all standard static meshes, meaning that each has its own collection of draw calls.


Figure 22: Draw calls from 200 static mesh spheres in scene (Max 569).

The blue set of meshes are a single-instanced static mesh, meaning that they share a single collection of draw calls.


Figure 23: Draw calls from 200 instanced static mesh spheres in scene (Max 143).

Looking at the GPU Visualizer for both, the Base Pass duration for the green (static) sphere is 4.30 ms and the blue (instanced) sphere renders in 3.11 ms; a duration optimization of ~27 percent in this scene.

One thing to know about instanced static meshes is that if any part of the collection is visible, the whole collection is rendered. This wastes potential throughput when much of it is drawn off camera. It's recommended to keep a single set of instanced meshes within a small area; for example, a pile of stones or trash bags, a stack of boxes, or distant modular buildings.


Figure 24: Instanced Mesh Sphere still rendering when mostly out of sight.
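For reference, here is a minimal C++ sketch of creating such an instanced set. The class name, instance count, and ring layout are illustrative and not from the original project; the mesh asset would be assigned in the editor.

    // Hypothetical actor that batches many copies of one mesh into a single
    // UInstancedStaticMeshComponent, so the copies share draw calls instead of
    // each being its own static mesh actor.
    #pragma once

    #include "CoreMinimal.h"
    #include "GameFramework/Actor.h"
    #include "Components/InstancedStaticMeshComponent.h"
    #include "InstancedRocks.generated.h"

    UCLASS()
    class AInstancedRocks : public AActor
    {
        GENERATED_BODY()

    public:
        AInstancedRocks()
        {
            // One component holds every copy of the mesh.
            InstancedMesh = CreateDefaultSubobject<UInstancedStaticMeshComponent>(TEXT("InstancedMesh"));
            RootComponent = InstancedMesh;
        }

        virtual void OnConstruction(const FTransform& Transform) override
        {
            Super::OnConstruction(Transform);
            InstancedMesh->ClearInstances();

            // Scatter 200 instances in a ring; each AddInstance call adds a
            // transform to the shared component rather than creating a new actor.
            for (int32 i = 0; i < 200; ++i)
            {
                const float Angle = 2.f * PI * i / 200.f;
                const FVector Location = FVector(FMath::Cos(Angle), FMath::Sin(Angle), 0.f) * 500.f;
                InstancedMesh->AddInstance(FTransform(Location));
            }
        }

        UPROPERTY(VisibleAnywhere)
        UInstancedStaticMeshComponent* InstancedMesh;
    };

If the mesh has LODs, swapping the component type to UHierarchicalInstancedStaticMeshComponent keeps the same usage while letting each instance use its LOD information, as described in the next section.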

Hierarchical Instanced Static Meshes

If collections of static meshes that have LODs are used, consider a Hierarchical Instanced Static Mesh.


Figure 25: Sphere of Hierarchical Instanced Meshes with Level Of Detail.

Like a standard instanced mesh, hierarchical instances reduce the number of draw calls made by the meshes, but the hierarchical instance also uses the LOD information of its meshes.


Figure 26: Up close to that sphere of Hierarchical Instanced Meshes with Level Of Detail.

Occlusion

In UE4, occlusion culling is a system where objects not visible to the player are not rendered. This helps to reduce the performance requirements of a game as you don’t have to draw every object in every level for every frame.


Figure 27: Spread of Octahedrons.

To see the occluded objects with their green bounding boxes, you can enter r.VisualizeOccludedPrimitives 1 (0 to turn it off) into the editor's console.


Figure 28: Viewing the bounds of occluded meshes with r.VisualizeOccludedPrimitives 1

Whether or not a mesh is drawn is controlled by its bounding box. Because of this, some objects are drawn even though the mesh itself is not visible to the player, because its bounding box is still visible to the camera.


Figure 29: Viewing bounds in the meshes details window.

If a mesh needs to be rendered before a player sees it, for example to allow extra streaming time or to let an idle animation start before it comes into view, the size of the bounding box can be increased under Static Mesh Settings > Positive Bounds Extension and Negative Bounds Extension in the mesh's settings window.


Figure 30: Setting the scale of the mesh’s bounds.
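The same extension can also be applied from code. A hedged sketch, assuming the UStaticMesh property and function names found in recent UE4 releases (verify them against your engine version):

    // Enlarge a static mesh's bounds so it is treated as visible, and therefore
    // rendered, slightly before the mesh itself enters the camera's view.
    #include "Engine/StaticMesh.h"

    void ExpandBounds(UStaticMesh* Mesh)
    {
        if (!Mesh)
        {
            return;
        }

        // Grow the bounding box by 100 units on every axis in both directions.
        Mesh->PositiveBoundsExtension = FVector(100.f);
        Mesh->NegativeBoundsExtension = FVector(100.f);

        // Recompute the extended bounds used for culling with the new extensions.
        Mesh->CalculateExtendedBounds();
    }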

Because the bounding box of a complex mesh always extends to the furthest edges of that mesh, empty space inside the box will cause the mesh to be rendered more often than necessary. It is important to think about how mesh bounding boxes will affect the performance of the scene.

For a thought experiment on 3D model design and importing into UE4, let’s think about how a set piece, a colosseum-style arena, could be made.

Imagine we have a player standing in the center of our arena floor, looking around our massive colosseum, about to face down their opponents. As the player rotates the camera, its direction and angle define what the game engine is rendering. Since this area is a set piece for our game it is highly detailed, but to save on draw calls we want to build it from a small number of solid pieces. First, we can discard the idea of the arena being one solid piece: in that case, every triangle of the arena has to be drawn whether it is in view or not, because it is all a single object. How can the model be cut up to bring it into the game?

It depends. There are a few things that will affect our decision. First is how the slices can be cut, and second is how those slices will affect their bounding boxes for occlusion culling. For this example, let’s say the player is using a camera angle of 90 degrees, to make the visuals easier.

If we look at a pizza-style cut, we can create eight identical slices to be wheeled around a zero point to make our whole arena. While this method is simple, it is far from efficient for occlusion, as there are a lot of overlapping bounding boxes. If the player is standing in the center and looking around, their camera will always cross three or four bounds, resulting in half the arena being drawn most of the time. In the worst case, with the player standing with their back to the inner wall and looking across the arena, all eight pieces will be rendered, granting no optimization.

Next, if we take the tic-tac-toe cut, we create nine slices. This method is less orthodox, but it has the advantage that there are no overlapping bounding boxes. As with the pizza cut, a player standing in the center of the arena will always cross three or four bounds. However, in the worst case of the player standing up against the inner wall, they will render six of the nine pieces, an improvement over the pizza cut.

As a final example, let’s make an apple core cut (a single center piece and eight wall slices). This method is the most common approach to this thought experiment and, with little overlap, a good way to build out the model. When the player is standing in the center they will be crossing five or six bounds, but unlike the other two cuts, the worst case for this cut is also five or six pieces rendered out of nine.

Figure 31: Thought experiment showing how a large model can be cut up, and how that affects bounding boxes and their overlap.

Cascaded Shadow Maps

Dynamic Shadow Cascades bring a high level of detail to your game, but they can be expensive and require a powerful gaming PC to run without a loss of frame rate.

Fortunately, as the name suggests, these shadows are dynamically created every frame, so can be set in game to allow the player to optimize to their preferences.

Cost of Dynamic Shadow Cascades using Intel® HD Graphics 530

The level of Dynamic Shadow Cascades can be dynamically controlled in several ways:

  • Shadow quality settings under the Engine Scalability Settings
  • Editing the integer value of r.Shadow.CSM.MaxCascades in the BaseScalability.ini file (between 0 and 4) and then changing sg.ShadowQuality (0-3 for Low, Medium, High, and Epic)
  • Adding an Execute Console Command node in a blueprint within your game where you manually set the value of r.Shadow.CSM.MaxCascades (see the C++ sketch after this list)
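As a hedged C++ sketch of that last option (the wrapper function and clamping here are illustrative; the console command call is a standard UE4 API):

    // Expose the cascade count as an in-game setting from C++ rather than a
    // blueprint Execute Console Command node.
    #include "Kismet/KismetSystemLibrary.h"

    void SetShadowCascades(UObject* WorldContextObject, int32 NumCascades)
    {
        // Clamp to the 0-4 range used by BaseScalability.ini.
        NumCascades = FMath::Clamp(NumCascades, 0, 4);

        // Same effect as the blueprint node; the variable could also be set
        // directly through IConsoleManager::Get().
        UKismetSystemLibrary::ExecuteConsoleCommand(
            WorldContextObject,
            FString::Printf(TEXT("r.Shadow.CSM.MaxCascades %d"), NumCascades));
    }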

Back to Part 1   Next Section


Unreal Engine 4 Optimization Tutorial, Part 3


This is part 3 of a tutorial to help developers improve the performance of their games in Unreal Engine* 4 (UE4). In this tutorial, we go over a collection of tools to use within and outside of the engine, as well as some best practices for the editor and scripting, to help increase the frame rate and stability of a project.

Scripting Optimizations

Disabling Fully Transparent Objects

Even fully transparent game objects consume rendering draw calls. To avoid these wasted calls, set the engine to stop rendering them.

To do this with Blueprints, UE4 needs multiple systems in place.

Material Parameter Collection

First, create a Material Parameter Collection (MPC). These assets store scalar and vector parameters that can be referenced by any material in the game, and can be used to modify those materials during play to allow for dynamic effects.

Create an MPC by selecting it under the Create Advanced Asset > Materials & Textures menu.


Figure 32: Creating a Material Parameter Collection.

Once in the MPC, default values for scalar and vector parameters can be created, named and set. For this optimization, we need a scalar parameter that we will call Opacity and we’ll use it to control the opacity of our material.


Figure 33: Setting a Scalar Parameter named Opacity.

Material

Next, we need a material to use the MPC. In that material, create a node called Collection Parameter. Through this node, select an MPC and which of its parameters will be used.


Figure 34: Getting the Collection Parameter node in a material.

Once the node is set up, drag off its return pin to use the value of that parameter.


Figure 35: Setting the Collection Parameter in a material.

The Blueprint Scripting Part

After creating the MPC and material, we can set and get the values of the MPC through a blueprint. The values can be read and changed with the Get/Set Scalar Parameter Value and Get/Set Vector Parameter Value nodes. Within those nodes, select the Collection (MPC) to use and a Parameter Name within that collection.

For this example, we set the Opacity scalar value to the sine of the game time, giving values between -1 and 1.


Figure 36: Setting and getting a scalar parameter and using its value in a function.
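The equivalent logic in C++ is a single call to UKismetMaterialLibrary per update. A minimal sketch, assuming the collection created above is assigned to a UMaterialParameterCollection* pointer and the function is called from an actor's Tick:

    // Drive the "Opacity" scalar in a Material Parameter Collection with the
    // sine of game time, mirroring the blueprint above.
    #include "Kismet/KismetMaterialLibrary.h"
    #include "Kismet/GameplayStatics.h"
    #include "Materials/MaterialParameterCollection.h"

    void UpdateOpacity(UObject* WorldContextObject, UMaterialParameterCollection* OpacityCollection)
    {
        if (!WorldContextObject || !OpacityCollection)
        {
            return;
        }

        // Sine of game time oscillates between -1 and 1, as in the blueprint example.
        const float Opacity = FMath::Sin(UGameplayStatics::GetTimeSeconds(WorldContextObject));
        UKismetMaterialLibrary::SetScalarParameterValue(WorldContextObject, OpacityCollection, TEXT("Opacity"), Opacity);
    }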

To set whether the object is being rendered, we create a new function called Set Visible Opacity with an input of the MPC’s Opacity parameter value and a static mesh component, and a Boolean return for whether or not the object is visible.

In that function we check whether the opacity is greater than a near-zero threshold, 0.05 in this example. A check against 0 could work, but as the value approaches zero the player can no longer see the object anyway, so we can turn it off just before it reaches zero. The threshold also provides a buffer against floating point errors that leave the scalar parameter not quite at 0, making sure the object is still turned off if the value ends up at 0.0001, for instance.

From there, run a branch where a True condition will Set Visibility of the object to be true, and a False condition to be set to false.


Figure 37: Set Visible Opacity function.
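A hedged C++ sketch of the same function (the 0.05 threshold is the one used above; the static mesh component is whichever component the material is applied to):

    // Hide the mesh once the MPC opacity drops below a small threshold rather
    // than exactly zero, to absorb floating point error. Returns whether the
    // object is now visible, matching the Boolean return of the blueprint function.
    #include "Components/StaticMeshComponent.h"

    bool SetVisibleOpacity(float Opacity, UStaticMeshComponent* Mesh)
    {
        if (!Mesh)
        {
            return false;
        }

        const bool bVisible = Opacity > 0.05f;
        Mesh->SetVisibility(bVisible);
        return bVisible;
    }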

Tick, Culling, and Time

If blueprints within the scene use Event Tick, those scripts are being run even when those objects no longer appear on screen. Normally this is fine, but the fewer blueprints ticking every frame in a scene, the faster it runs.

Some examples of when to use this optimization are:

  • Things that do not need to happen when the player is not looking
  • Processes that run based on game time
  • Non-Player Characters (NPC) that do not need to do anything when the player is not around

As a simple solution, we can add a Was Recently Rendered check to the beginning of our Event Tick. This way we do not have to wire up custom events and listeners to turn our tick on and off, and the system remains independent of other actors in the scene.


Figure 38: Using the culling system to control the content of Event Tick.
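In C++ the same guard is one call at the top of Tick. A minimal sketch, assuming a hypothetical actor class named AMyTickingActor:

    // Skip cosmetic per-frame work while the actor has not been rendered recently.
    // AActor::WasRecentlyRendered() returns true if the actor was rendered within
    // the given tolerance (0.2 seconds by default).
    void AMyTickingActor::Tick(float DeltaSeconds)
    {
        Super::Tick(DeltaSeconds);

        if (!WasRecentlyRendered())
        {
            return;
        }

        // ... visual-only updates go here ...
    }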

Following that approach, if we have a process that runs on game time, say an emissive material on a button that dims and brightens every second, we can use the method shown below.


Figure 39: Emissive Value of material collection set to the absolute sine of time when it is rendered. 

What we see in the figure is a check of game time that is passed through the absolute value of sine plus one, which gives a sine wave ranging from 1 to 2.

The advantage is that no matter when the player looks at this button, even if they spin in circles or stare, it always appears to be timed correctly to this curve thanks to the value being based on the sine of game time.

This also works well with modulo, though the graph looks a bit different.

This check can be called later into the Event Tick. If the actor has several important tasks that need to be done every frame, they can be executed before the render check. Any reduction in the number of nodes called on a tick within a blueprint is an improvement.


Figure 40: Using culling to control visual parts of a blueprint.

Another approach to limiting the cost of a blueprint is to slow it down and only let it tick once every time interval. This can be done using the Set Actor Tick Interval node so that the time needed is set through scripting.


Figure 41: Switching between tick intervals.

In addition, the Tick Interval can be set in the Details tab of the blueprint. This allows setting when the blueprint will tick based on time in seconds.


Figure 42: Finding the Tick Interval within the Details tab.
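Both of these map to one line of C++ each. A minimal sketch, assuming a hypothetical actor class named AMySlowActor:

    // Make an actor tick once per second instead of every frame.
    AMySlowActor::AMySlowActor()
    {
        PrimaryActorTick.bCanEverTick = true;

        // Equivalent of the Tick Interval field in the Details tab.
        PrimaryActorTick.TickInterval = 1.0f;
    }

    void AMySlowActor::OnSomeEvent()
    {
        // Equivalent of the Set Actor Tick Interval node: change it at run time.
        SetActorTickInterval(1.0f);
    }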

For example, this is useful for counting seconds.


Figure 43: Setting a second counting blueprint to only tick once every second.

To see how this optimization can reduce the average frame cost, let's look at the following example.


Figure 44: An extremely useful example of something not to do.

Here we have a ForLoop node that counts 0 to 10000, and we set the integer Count to the current count of the ForLoop. This blueprint is extremely costly and inefficient, so much so that it has our scene running at 53.49 ms.


Figure 45: Viewing the cost of the extremely useful example with Stat Unit.

If we go into the Profiler we see why. This simple yet costly blueprint takes 43 ms per tick.


Figure 46: Cost of extremely useful example ticking every frame as viewed in the Profiler.

However, if we only tick this blueprint once every second, it takes 0 ms most of the time. If we look at the average time (click and drag over an area in the Graph View) over three tick cycles for the blueprint, we see that it averages 0.716 ms.


Figure 47: Cost average of the extremely useful example ticking only once every second as viewed in the Profiler.

To look at a more common example, if we have a blueprint that runs at 1.4 ms in a scene running at 60 fps, it uses 84 ms of processing time every second (1.4 ms x 60 frames). Reducing how often the blueprint ticks directly reduces the total processing time it consumes.

Mass Movement, ForLoops, and Multithreading

The idea of several meshes all moving at once looks awesome and can really sell the visual style of a game. However, the processing cost can put a huge strain on the CPU and, in turn, the FPS. Thanks to multithreading and UE4’s handling of worker threads, we can break up the handling of this mass movement across multiple blueprints to optimize performance.

For this section, we will use the following blueprint scripts to dynamically move a collection of 1600 instanced sphere meshes up and down along a modified sine curve.

Here is a simple construction script to build out the grid. Simply add an Instanced Static Mesh component to an actor, choose the mesh to use for it in the Details tab, and then add these nodes to its Construction Script.


Figure 48: Construction Script to build a simple grid.

Once the grid is created, add this blueprint script to the Event Graph.

One thing to note about the Update Instance Transform node: when the transform of any instance is modified, the change will not be visible unless Mark Render State Dirty is set to true. This is an expensive operation, as it goes through every mesh in the instance and marks it as dirty, so to save on processing, especially when the node runs many times in a single tick, only update the meshes at the end of the blueprint. In the script below we set Mark Render State Dirty to true only on the last index of the ForLoop, that is, when Index equals Grid Size minus one.


Figure 49:  Blueprint for dynamic movement for an instanced static mesh.
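A hedged C++ equivalent of the update in Figure 49 (AMeshGrid and its InstancedMesh component are hypothetical names; the sine-based motion mirrors the blueprint, and the render state is only marked dirty on the last index):

    // Move every instance along a sine curve, but only mark the render state
    // dirty on the last index so the expensive update happens once per tick.
    void AMeshGrid::Tick(float DeltaSeconds)
    {
        Super::Tick(DeltaSeconds);

        const int32 Count = InstancedMesh->GetInstanceCount();
        const float Time = GetWorld()->GetTimeSeconds();

        for (int32 Index = 0; Index < Count; ++Index)
        {
            FTransform Xform;
            if (!InstancedMesh->GetInstanceTransform(Index, Xform, /*bWorldSpace=*/false))
            {
                continue;
            }

            FVector Location = Xform.GetLocation();
            Location.Z = 100.f * FMath::Sin(Time + Index * 0.1f);
            Xform.SetLocation(Location);

            const bool bLastIndex = (Index == Count - 1);
            InstancedMesh->UpdateInstanceTransform(Index, Xform,
                /*bWorldSpace=*/false, /*bMarkRenderStateDirty=*/bLastIndex, /*bTeleport=*/true);
        }
    }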

With our actor blueprint, its grid-creation Construction Script, and the dynamic movement event, we can place several different variants with the goal of always having 1600 meshes displayed at once.


Figure 50:  Diagram of the grid of 1600 broken up into different variations.

When we run the scene we get to see the pieces of our grid traveling up and down.


Figure 51:  Instanced static mesh grid of 1600 moving dynamically.

However, the breakdown of the pieces we have affects the speed at which our scene runs.

Looking at the chart above, we see that 1600 actors of one instanced mesh each (which negates the purpose of using instancing at all) and the single actor of 1600 meshes run the slowest, while the rest all hover between 19 and 20 ms.

The reason the individual pieces run the slowest is that the cost of running the 1600 blueprints is 16.86 ms, an average of only 0.0105 ms per blueprint. While the cost of each blueprint is tiny, the sheer number of them slows the system down, and the only way to optimize that is to reduce the number of blueprints running per tick. The other slowdown comes from the increased number of draw calls and mesh transform commands caused by the large number of individual meshes.

On the opposite side of the graph we see the next biggest offender, the single piece of 1600 meshes. This mesh is very efficient on draw calls, since the whole grid is only one draw call, but the cost of running the blueprint that must update all 1600 meshes per tick causes it to take 19.63 ms of time to process.

Looking at the processing time for the other three sets, we see the benefit of breaking up these mass-movement actors: each script is cheaper, and because UE4 spreads blueprints across many worker threads, the evaluation runs faster by effectively utilizing all CPU cores.

If we look at a simple breakdown of the processing time for the blueprints and how they are split among the worker threads, we see the following.

Data Structures

Using the correct type of data structure is imperative in any program, and this applies to game development just as much as to any other software development. When programming in UE4 with blueprints, no data structures are provided beyond the templated array that acts as the main container; others can be built by hand using functions and the nodes provided by UE4.

Example of Usage

As an example of why and how a data structure could be used in game development, consider a shoot 'em up (Shmup) style game. One of the main mechanics of a Shmup is shooting thousands of bullets across the screen toward incoming enemies. While one could spawn each of the bullets and then destroy them, that would require a lot of garbage collection on the part of the engine and could cause a slowdown or loss of frame rate. To get around this, developers could consider a spawning pool of bullets (a collection of objects placed into an array or list when the game starts), enabling and disabling them as needed, so the engine only needs to create each bullet once.

A common way to use these spawning pools is to grab the first bullet in the array/list that is not enabled, move it into a starting position, enable it, and then disable it when it flies off screen or into an enemy. The problem with this method is its run time, or Big O: because you iterate through the collection looking for the next disabled object, a collection of 5000 objects could take up to 5000 iterations to find one. Such a function has a time of O(n), where n is the number of objects in the collection.

While O(n) is far from the worst an algorithm can perform, the closer we can get to O(1), a fixed cost regardless of size, the more efficient our script and game will be. To do this with a spawning pool we use a data structure called a Queue. Like a queue in real life, this data structure takes the first object in the collection, uses it, and then removes it, continuing the line until every object has been de-queued from the front.

By using a queue for our spawning pool, we can get the front of our collection, enable it, and then pop it (remove it) from the collection and immediately push it (add it) to the back, creating an efficient cycle within our script and reducing its run time to O(1). We can also add an enabled check to this cycle: if the object that would be popped is still enabled, the script instead spawns a new object, enables it, and pushes it to the back of the queue, increasing the size of the collection without hurting the run time.
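A hedged C++ sketch of that cycle, using hidden-in-game as the "enabled" flag (the pool structure, names, and the way enabled state is tracked are illustrative, and the bullet class is whatever projectile class the game uses):

    // Spawning pool built on a queue: reuse the bullet at the front if it is
    // free, otherwise grow the pool, and always push the spawned bullet to the
    // back so the cycle stays O(1).
    #include "CoreMinimal.h"
    #include "GameFramework/Actor.h"
    #include "Engine/World.h"

    struct FBulletPool
    {
        TArray<AActor*> Queue;    // front = index 0, back = last element
        TSet<AActor*>   Active;   // bullets currently enabled / in flight

        AActor* Spawn(UWorld* World, TSubclassOf<AActor> BulletClass, const FTransform& SpawnTransform)
        {
            AActor* Bullet = nullptr;

            if (Queue.Num() > 0 && !Active.Contains(Queue[0]))
            {
                // Front of the queue is free: pop and reuse it.
                Bullet = Queue[0];
                Queue.RemoveAt(0);
            }
            else
            {
                // Front is still in use (or the pool is empty): grow the pool instead.
                Bullet = World->SpawnActor<AActor>(BulletClass);
            }

            Bullet->SetActorTransform(SpawnTransform);
            Bullet->SetActorHiddenInGame(false);
            Active.Add(Bullet);
            Queue.Add(Bullet);    // push to the back; the cycle continues in O(1)
            return Bullet;
        }

        void Despawn(AActor* Bullet)
        {
            // Called when the bullet flies off screen or hits an enemy.
            Bullet->SetActorHiddenInGame(true);
            Active.Remove(Bullet);
        }
    };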

Queues

Below is a collection of pictures that illustrate how to implement a queue in blueprints, using functions to help maintain code cleanliness and reusability.

Pop


Figure 52: A queue pop with return implemented in blueprints.

Push


Figure 53: A queue push implemented in blueprints.

Empty


Figure 54: A queue empty implemented in blueprints.

Size


Figure 55: A queue size implemented in blueprints.

Front


Figure 56:  A queue front implemented in blueprints.

Back


Figure 57: A queue back implemented in blueprints.

Insert


Figure 58:  A queue insert with position check implemented in blueprints.

Swap


Figure 59: A queue swap with position checks implemented in blueprints.
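Taken together, the operations in Figures 52-59 map onto a plain array the same way the blueprint versions do. A rough C++ sketch using TArray, with the front of the queue at index 0 (the element type and struct name are illustrative):

    // Queue operations backed by a TArray, mirroring the blueprint functions above.
    #include "CoreMinimal.h"

    struct FBlueprintStyleQueue
    {
        TArray<int32> Items;

        bool  Empty() const     { return Items.Num() == 0; }
        int32 Size() const      { return Items.Num(); }
        int32 Front() const     { return Items[0]; }        // assumes !Empty()
        int32 Back() const      { return Items.Last(); }    // assumes !Empty()
        void  Push(int32 Value) { Items.Add(Value); }       // add to the back

        // Remove and return the front element.
        int32 Pop()
        {
            const int32 Value = Items[0];
            Items.RemoveAt(0);
            return Value;
        }

        // Insert with a position check.
        bool Insert(int32 Value, int32 Position)
        {
            if (Position < 0 || Position > Items.Num())
            {
                return false;
            }
            Items.Insert(Value, Position);
            return true;
        }

        // Swap with position checks.
        bool Swap(int32 IndexA, int32 IndexB)
        {
            if (!Items.IsValidIndex(IndexA) || !Items.IsValidIndex(IndexB))
            {
                return false;
            }
            Items.Swap(IndexA, IndexB);
            return true;
        }
    };

The stack versions illustrated in the next section would differ mainly in that Pop and Back operate on the last element of the array instead of index 0.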

Stacks

Below is a collection of pictures that illustrate how to implement a stack in blueprints, using functions to help maintain code cleanliness and reusability.

Pop


Figure 60: A stack pop with return implemented in blueprints.

Push


Figure 61: A stack push implemented in blueprints.

Empty


Figure 62: A stack empty implemented in blueprints.

Size


Figure 63: A stack size implemented in blueprints.

Back


Figure 64: A stack back implemented in blueprints.

Insert


Figure 65: A stack insert with position check implemented in blueprints.

Back to Part 2

IoT Reference Implementation: Making of an Environment Monitor Solution


To demonstrate the value of Internet of Things (IoT) code samples provided on Intel‘s Developer Zone as the basis for more complex solutions, this project extends previous work on air quality sensors. This IoT reference implementation builds on the existing Intel® IoT air quality sensor code samples to create a more comprehensive Environment-Monitor solution. Developed using Intel® System Studio IoT Edition and an Intel® Next Unit of Computing (Intel® NUC) as a gateway connected to sensors using Arduino* technology, the solution is based on Ubuntu*. Data is transferred to the cloud using Amazon Web Services (AWS)*.

Visit GitHub for this project’s latest code samples and documentation.

IoT promises to deliver data from previously obscure sources, as the basis for intelligence that can deliver business value and human insight, ultimately improving the quality of human life. Air quality is an excellent example of vital information that is all around us but often unseen, with importance that ranges from our near-term comfort to the ultimate well-being of our species, in terms of pollution and its long-term effects on our climate and environment.

Given that IT infrastructure has historically been seen as having a negative environmental footprint, IoT can potentially be seen as a vital inflection point. That is, as we strive for greater understanding of the world and our impact on it, technology is being developed to give us information that can drive wise decisions.

For example, IoT sensors gather data about levels of carbon dioxide, methane, and other greenhouse gases, toxins such as carbon monoxide, hexane, and benzene, as well as particulate irritants that include dust and smoke. They can also be used to capture granular information about the effects of these contaminants over vast or confined areas, based on readings such as temperature, light levels, and barometric pressure.

Building on a Previous Air-Quality Sensor Application

Part of the Intel® Developer Zone value proposition for IoT comes from the establishment of seed projects that provide points of entry into a large number of IoT solution domains. These are intended both as instructive recipes that interested developers can recreate and as points of departure for novel solutions. 

The Existing Air Quality Sensor Application

The Environment Monitor described in this development narrative is an adaptation of a previous project, a more modest air-quality sensor implementation. The previous air-quality sensor solution was built as part of a series of how-to Intel® IoT code sample exercises using the Intel® IoT Developer Kit.

Further information on that application can be found in the GitHub repository: https://github.com/intel-iot-devkit/how-to-code-samples/blob/master/air-quality-sensor/javascript/README.md

The core functionality of the application is centered on a single air-quality sensor and includes the following:

  • Continuously checking air quality for airborne contaminants, based on whether any of several gases exceeds a defined threshold.
  • Sounding an audible alarm whenever one of the threshold values is exceeded, indicating unhealthy air.
  • Storing alert history in the cloud, tracking and providing a historical record of each time the application generates an alert.

The application code is built as a Node.js* project using JavaScript*, with MRAA (I/O library) and UPM (sensor library) from the Intel® IoT Developer Kit. This software interfaces with the following hardware components:  

  • Intel® NUC
  • Arduino 101*
  • Grove* Starter Kit 

Note: Here we use an Intel® NUC, but other compatible gateways can also be used.

You can also use the Grove* IoT Commercial Developer Kit for this solution. The IDE used in creating the application is the Intel® System Studio IoT Edition.

The air-quality sensor application provides a foundation for the Environment Monitor.

Adapting the Existing Project to Create a New Solution

This document shows how projects such as the previous air-quality sensor can readily be modified and expanded. The new Environment Monitor solution, shown in Figure 1, extends the IoT capability by adding sensors and gives the user flexibility in the choice of operating system (either the Intel® IoT Gateway Software Suite or Ubuntu* Server).

Environmental Monitor

Figure 1. Enhanced Environmental Monitor Solution

Sensor Icons
Figure 2. Sensor Icons

Table 1. Sensor Icon Legend

Dust | Gas | Temperature | Humidity

The relatively simple scope of the existing air-quality sensor application allows significant opportunity for expanding on its capabilities.

The team elected to add a number of new sensors to expand on the existing solution’s functionality. The Environment Monitor solution includes new sensors for dust, temperature, and humidity.   

In addition to the changes made as part of this project initiative, interested parties could make changes to other parts of the solution as needed. These could include, for example, adding or removing sensors or actuators, switching to a different integrated development environment (IDE), programming language, or OS, or adding entirely new software components or applications to create novel functionality.

Hardware Components of the Environment Monitor Solution

  • Intel® NUC
  • Arduino 101*
  • Grove* sensors

Based on the Intel® Atom™ processor E3815, the Intel® NUC offers a fanless thermal solution, 4 GB of onboard flash storage (and SATA connectivity for additional storage), as well as a wide range of I/O ports. The Intel® NUC is conceived as a highly compact and customizable device that provides capabilities at the scale of a desktop PC. It provides the following key benefits:

  • Robust compute resources to ensure smooth performance without bogging down during operation.
  • Ready commercial availability to help ensure that the project could proceed on schedule.
  • Pre-validation for the OS used by the solution (Ubuntu).

The Arduino 101* board makes the Intel® NUC both hardware and pin compatible with Arduino shields, in keeping with the open-source ideals of the project team. While Bluetooth* is not used in the current iteration of the solution, the hardware does have that functionality, which the team is considering for future use.

The Intel® NUC and Arduino 101 board are pictured in Figure 3, and the specifications of each are given in Table 2.

Intel NUC Kit
Figure 3. Intel® NUC Kit DE3815TYKHE and Arduino* 101 board.

Table 2. Prototype hardware used in the Environment Monitor solution

                          | Intel® NUC Kit DE3815TYKHE                          | Arduino 101 Board
Processor/Microcontroller | Intel® Atom™ processor E3815 (512K Cache, 1.46 GHz) | Intel® Curie™ Compute Module @ 32 MHz
Memory                    | 8 GB DDR3L-1066 SODIMM (max)                        | 196 KB flash memory, 24 KB SRAM
Networking / IO           | Integrated 10/100/1000 LAN                          | 14 digital I/O pins, 6 analog I/O pins
Dimensions                | 190 mm x 116 mm x 40 mm                             | 68.6 mm x 53.4 mm
Full Specs                | specs                                               | specs

For the sensors and other components needed in the creation of the prototype, the team chose the Grove* Starter Kit for Arduino* (manufactured by Seeed Studio*), which is based on the Grove* Starter Kit Plus used in the Grove* IoT Commercial Developer Kit. This collection of components is available at low cost, and because it is a pre-selected set of parts, it reduces the effort required to identify and procure the bill of materials. The list of components, their functions, and connectivity are given in Table 3.

Table 3. Bill of Materials

            | Component                                   | Details                                  | Pin Connection | Connection Type
Base System | Intel® NUC Kit DE3815TYKHE                  | Gateway                                  |                |
            | Arduino* 101 board                          | Sensor hub                               |                | USB
            | Grove* - Base Shield                        | Arduino 101 Shield                       |                | Shield
            | USB Type A to Type B Cable                  | Connect Arduino 101 board to Intel® NUC  |                |
Sensors     | Grove - Green LED                           | LED indicates status of the monitor      | D2             | Digital
            | Grove - Gas Sensor (MQ2)                    | Gas sensor (CO, methane, smoke, etc.)    | A1             | Analog
            | Grove - Dust Sensor                         | Particulate matter sensor                | D4             | Digital
            | Grove - Temp&Humi&Barometer Sensor (BME280) | Temperature, humidity, barometer sensor  | Bus 0          | I2C

Using that bill of materials and connectivity schema, the team assembled the bench model of the Environment monitor, as illustrated in the figures below.

Intel NUC Kit
Figure 4. Intel® NUC, Arduino* 101 board, and sensor components.

Pin Connections
Figure 5. Pin Connections to the Arduino* 101 Board

Sensor Callouts
Figure 6. Sensor callouts.

Software Components of the Environment Monitor Solution

Apart from the physical model, the Environment monitor solution also includes a variety of software components, which are described in this section. As mentioned above, the solution includes an administrative application running on the Intel® NUC, as well as a mobile customer application for general users, which is designed to run on a tablet PC or smartphone.

The previous air quality sensor application runs on Intel® IoT Gateway Software Suite. In contrast, the Environment monitor solution uses Ubuntu* Server. 

The IDE used to develop the software for the Environment Monitor solution is Intel® System Studio IoT Edition, which facilitates connecting to the Intel® NUC and developing applications.

Like the previous air-quality sensor application, this solution uses MRAA and UPM from the Intel® IoT Developer Kit to interface with platform I/O and sensor data. The MRAA library provides an abstraction layer for the hardware to enable direct access to I/O on the Intel® NUC, as well as Firmata*, which allows for programmatic interaction with the Arduino* development environment, taking advantage of Arduino’s hardware-abstraction capabilities. Abstracting Firmata using MRAA enables greater programmatic control of I/O on the Intel® NUC, simplifying the process of gathering data from sensors. UPM is a library developed on top of MRAA that exposes a user-friendly API and provides the specific function calls used to access sensors.

Administrative Application

A simple administrative application, the user interface of which is shown below, runs on the Intel® NUC. This application provides a view of the data from the solution's sensor array, with the ability to generate views of changes to that data over time. It also provides buttons that can be used by an administrator to trigger events by generating simulated sensor data that is outside preset “normal” ranges. This application is built to be extensible, with the potential to support additional types of sensors in multiple geographic locations, for example.

Figure 7 shows the main window of the administrative application, simulating normal operation when sensor data is all within the preset limits. This window includes controls to set the thresholds for alerts based on the various sensors, with display of the thresholds themselves and current sensor readings.

Admin Window
Figure 7. Main administrative window showing status of sensors within normal parameters.

In Figure 8, one of the buttons to generate a simulated event has been pressed. The active state of that button is indicated visually, the data reading is shown to be outside normal parameters, and the alert indicator is active. The operator can acknowledge the alert, dismissing the alarm while the application continues to register the non-normal data. Once the sensor data passes back into the normal range, the screen will return to its normal operating state.

Simulated Alert
Figure 8. Main administrative window during generation of a simulated alert.

The administrative application also provides the ability to view time-series sensor data, as illustrated in Figure 9. Using this functionality, operators can track changes over time in the data from a given sensor, which provides simple trending information that could be augmented using analytics.

Log File Screen History
Figure 9. Log-file screen showing historical sensor data.

The Environment monitor solution uses AWS* cloud services to provide a central repository for real-time and historical sensor data. This cloud-based storage could be used, for example, to aggregate information from multiple sensors as the basis for a view of contaminant levels and resultant effects over a large geographical area. Analytics run against either stored or real-time streaming data could potentially generate insights based on large-scale views of substances of interest in the atmosphere.

Using capabilities such as these, development organizations could establish a scope of monitoring that is potentially open-ended in terms of the factors under investigation and the geographic area under observation. The administrative application takes advantage of cloud-based resources in order to support large-scale analytics on big data, as the foundation for future innovation.

Note: The administrative application can access data on AWS* using either a back-end datastore or data storage based on Message Queue Telemetry Transport* (MQTT*), a machine-to-machine messaging protocol. Implementation guidance for those options is available in the project's GitHub documentation.

Conclusion

The solution discussed in this development narrative provides an example of how existing IoT solutions provided by Intel can provide a springboard for development of related projects. In this case, a relatively simple air-quality sensor application is the basis of a more complex and robust Environment Monitor solution, which incorporates additional sensors and more powerful system hardware, while retaining the ability of the original to be quickly brought from idea to reality. Using this approach, IoT project teams working on the development of new solutions don’t need to start from scratch.

More Information

IoT Reference Implementation: How to Build an Environment Monitor Solution


This guide demonstrates how existing IoT solutions can be adapted to address more complex problems (e.g., solutions that require more sensor monitoring). The solution we present here, an Environment Monitor, incorporates additional hardware and extends the use of IoT software libraries (sensors and I/O). Also, the solution has been adapted so the gateway can work with multiple operating systems.

Visit GitHub for this project’s latest code samples and documentation. 

The Environment Monitor solution, shown in Figure 1, is built using an Intel® NUC Kit DE3815TYKHE, an Arduino 101* (branded Genuino 101* outside the U.S.) board, and Grove* sensors available from Seeed Studio*. The solution runs on Ubuntu* Server, and the Intel® System Studio IoT Edition IDE is used to develop the code that drives the sensors.


Figure 1. Adapted Environment Monitor Solution

The Intel® NUC acts as a gateway for the solution. The Intel® NUC provides plenty of compute power to function as a router, run higher-level services such as a web server, and interact with other cloud services (AWS, MQTT, etc.). However, it does not provide any I/O ports for interfacing directly with sensors, so the Arduino 101* acts as an edge device/sensor hub. Firmata*, a communication protocol, is used to control the Arduino 101 from the application running on the Intel® NUC. In turn, the gateway can be programmed using Intel® System Studio IoT Edition from the host computer.

This solution is built around MRAA (I/O library) and UPM (sensor library) from the Intel® IoT Developer Kit to interface with platform I/O and sensor data. In this case, the MRAA library provides an abstraction layer for the hardware to enable direct access to I/O on the Arduino 101 board using Firmata*. The UPM sensor library was developed on top of MRAA and exposes a user-friendly API that will allow the user to capture sensor data with just a few lines of code. Data is then sent periodically to Amazon Web Services (AWS)* using MQTT*.
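As a hedged illustration of that stack (not the project's actual source, which is in the GitHub repository linked above), reading the gas sensor and driving the indicator LED through MRAA's Firmata subplatform looks roughly like the sketch below. The /dev/ttyACM0 device path and the 512 subplatform pin offset are common defaults that should be verified for your setup:

    // Minimal sketch, assuming an MRAA build with Firmata support and an
    // Arduino 101 attached on /dev/ttyACM0. Subplatform pins are addressed
    // with an offset of 512 (so A1 becomes 513 and D2 becomes 514).
    #include <iostream>
    #include <unistd.h>
    #include "mraa.hpp"

    int main()
    {
        // Register the Arduino 101 (running Firmata) as an MRAA subplatform.
        if (mraa::addSubplatform(mraa::GENERIC_FIRMATA, "/dev/ttyACM0") != mraa::SUCCESS) {
            std::cerr << "Could not attach the Firmata subplatform" << std::endl;
            return 1;
        }

        mraa::Aio gas(512 + 1);     // Grove gas sensor (MQ2) on A1
        mraa::Gpio led(512 + 2);    // Grove green LED on D2
        led.dir(mraa::DIR_OUT);

        while (true) {
            int reading = gas.read();            // raw ADC value
            std::cout << "Gas sensor: " << reading << std::endl;
            led.write(1);                        // blink the status LED
            sleep(1);
            led.write(0);
            sleep(1);
        }
        return 0;
    }

The UPM classes used by the project wrap these same MRAA calls behind sensor-specific APIs, so the application code works with named sensor readings rather than raw analog values.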

The exercise in this document describes how to build the Environment Monitor solution.

From this exercise, developers will learn how to:

  • Setup the system hardware
    • Intel® NUC Kit DE3815TYKHE
    • Arduino 101* board
    • Sensors
  • Install and configure the required software
  • Connect to Cloud Services
    • Amazon Web Service (AWS)* using MQTT* 

Setup the System Hardware

This section describes how to set up all the required hardware for the Environment Monitor solution: the Intel® NUC Kit DE3815TYKHE, the Arduino 101 board, and Grove* sensors.

Intel® NUC Setup


Figure 2. Intel® NUC with bootable USB


Figure 3. Back of the Intel® NUC

Setting up the Intel® NUC for this solution consists of the following steps:

  1. Follow the Intel® NUC DE3815TYKHE User Guide (available online here) and determine if additional components, such as system memory, need to be installed. Optionally, an internal drive and/or wireless card can be added.
  2. Connect a monitor via the HDMI or VGA port and a USB keyboard. These are required for OS deployment and can be removed after the Intel® NUC has been connected to the network and a connection from the development environment has been established.
  3. Plug in an Ethernet cable from your network’s router. This step can be omitted if a wireless network card has been installed instead.
  4. Plug in the power supply for the Intel® NUC but DO NOT press the power button yet. First connect the Arduino 101 and other hardware components and then power on the Intel® NUC. 

Note: The Intel® NUC provides a limited amount of internal eMMC storage (4 GB). Consider using an internal drive or a USB thumb drive to extend the storage capacity.

Arduino 101 Setup

In general, the Arduino 101 board will be ready to use out of the box without any additional changes. A USB Type A to Type B cable is required to connect the Arduino 101 to the Intel® NUC.

Additional setup instructions for the Arduino 101 board are available at https://www.arduino.cc/en/Guide/Arduino101.

Sensors Setup

The sensors used for this project are listed in Table 1.

First, plug in the Grove* Base Shield on top of the Arduino 101 board.

Three sensors with various functions relevant to monitoring the environment have been selected:

  • Grove - Gas Sensor (MQ2) measures the concentration of several gases (CO, CH4, propane, butane, alcohol vapors, hydrogen, liquefied petroleum gas) and is connected to analog pin 1 (A1). It can detect hazardous levels of gas concentration.
  • Grove - Dust Sensor detects fine and coarse particulate matter in the surrounding air and is connected to digital pin 4 (D4).
  • Grove - Temperature, Humidity & Barometer sensor is based on the Bosch* BME280 chip and is used to monitor temperature and humidity. It can be plugged into any of the connectors labeled I2C on the shield.

The Grove green LED acts as an indicator LED to show whether the application is running or not, and is connected to digital pin 2 (D2).

Table 1. Bill of Materials

            | Component                                   | Details                                  | Pin Connection | Connection Type
Base System | Intel® NUC Kit DE3815TYKHE                  | Gateway                                  |                |
            | Arduino* 101 board                          | Sensor hub                               |                | USB
            | Grove* - Base Shield                        | Arduino 101 Shield                       |                | Shield
            | USB Type A to Type B Cable                  | Connect Arduino 101 board to Intel® NUC  |                |
Sensors     | Grove - Green LED                           | LED indicates status of the monitor      | D2             | Digital
            | Grove - Gas Sensor (MQ2)                    | Gas sensor (CO, methane, smoke, etc.)    | A1             | Analog
            | Grove - Dust Sensor                         | Particulate matter sensor                | D4             | Digital
            | Grove - Temp&Humi&Barometer Sensor (BME280) | Temperature, humidity, barometer sensor  | Bus 0          | I2C


Note: here a green LED is used but any color LED (red, blue, etc.) can be used as an indicator.


Figure 4. Sensor connections to the Arduino 101


Figure 5. Sensors and pin connections

Install and Configure the Required Software

This section gives instructions for installing the operating system, connecting the Intel® NUC to the Internet, installing the required software libraries, and finally cloning the project sources from the GitHub* repository.

Installing the OS: Ubuntu Server

Note: Find additional information about drivers and troubleshooting: http://www.intel.com/content/www/us/en/support/boards-and-kits/000005499.html.

Connecting the Intel® NUC to the Internet

This section describes how to connect the Intel® NUC to your network, which will enable you to deploy and run the project from a different host on the same network (i.e. your laptop). Internet access is required in order to download the additional software libraries and the project code.

The following steps list commands to be entered into a terminal (shell) on the Intel® NUC. You can connect to the Internet through an Ethernet cable or Wi-Fi*.

Ethernet
  1. After Ubuntu is installed, restart the Intel® NUC and login with your user ID.
  2. Type in the command ifconfig and locate the interface named enp3s0 (or eth0). Use the interface name for the next step.
  3. Open the network interface file using the command: vim /etc/network/interfaces and type:

    auto enp3s0
    iface enp3s0 inet dhcp

  4. Save and exit the file and restart the network service using:
    /etc/init.d/networking restart

Note: If you are connecting to external networks via a proxy, additional proxy configuration is also required.

Wi-Fi (optional)

This is an optional step that only applies if a wireless card has been added to the Intel® NUC.

  1. Install Network Manager using the command: sudo apt install network-manager and then install WPA supplicant using: sudo apt install wpasupplicant
  2. Once these are in place, check your Wi-Fi* interface name using ifconfig. This example uses wlp2s0. Now run these commands:
    • Add the wifi interface to the interfaces file at: /etc/network/interfaces by adding the following lines:

      auto wlp2s0
      iface wlp2s0 inet dhcp

    • Restart the networking service: /etc/init.d/networking restart
    • Run nmcli networking, nmcli n connectivity, and nmcli radio; these commands tell you whether networking and the radio are enabled. If either of them reports not enabled, you will have to enable full connectivity. To enable the radio, use the following command: nmcli radio wifi on
    • Now check the connection status: nmcli device status
    • If the Wi-Fi interface shows up as unmanaged then troubleshoot.
    • To check for and add wifi connections: nmcli d wifi rescan

      nmcli d wifi
      nmcli c add type wifi con-name [network-name] ifname [interface-name] ssid [network-ssid]

    • Running nmcli c should show you the connection you have tried to connect to. In case you are trying to connect to an enterprise network, you might have to make changes to /etc/NetworkManager/system-connections/[network-name]
    • Now bring up the connection and the network interfaces:

      nmcli con up [network-name]
      ifdown wlp2s0
      ifup wlp2s0

You can now use the Intel® NUC remotely from your development machine if you are on the same network. 

Installing the MRAA and UPM libraries

To install UPM and MRAA on your system, you can use the MRAA PPA to get up-to-date libraries. The instructions are as follows:

sudo add-apt-repository ppa:mraa/mraa
sudo apt-get update
sudo apt-get install libupm-dev python-upm python3-upm upm-examples libmraa1 mraa-firmata-fw mraa-imraa

You can also build from source:

MRAA instructions: https://github.com/intel-iot-devkit/mraa/blob/master/docs/building.md

UPM instructions: https://github.com/intel-iot-devkit/upm/blob/master/docs/building.md

Note: You’ll need CMake if you plan to build from source.

Plug in the Arduino 101* board and reboot the Intel® NUC. Once the Firmata* sketch has been flashed onto the Arduino 101, you are ready to use MRAA and UPM. If you get an error, run the command imraa -a. If dfu-util is missing, install it after adding the MRAA PPA so that the dfu-util version bundled with MRAA is used.

Cloning the Git* repository

Clone the reference implementation repository with Git* on your development computer using:

$ git clone https://github.com/intel-iot-devkit/reference-implementation.git

Alternatively, you can download the repository as a .zip file. To do so, from your web browser (make sure you are signed in to your GitHub account) click the Clone or download button on the far right (green button in Figure 6 below). Once the .zip file is downloaded, unzip it, and then use the files in the directory for this example.


Figure 6

Create the Development and Runtime Environment

This section gives instructions for setting up the rest of the computing environment needed to support the Environment Monitor solution, including installation of Intel® System Studio IoT Edition, creating a project, and populating it with the files needed to build the solution.

Install Intel® System Studio IoT Edition

Intel® System Studio IoT Edition allows you to connect to, update, and program IoT projects on the Intel® NUC.

Windows Installation

Note: Some files in the archive have extended paths. We recommend using 7-Zip*, which supports extended path names, to extract the installer files.

  1. Install 7-Zip (Windows only):
    • Download the 7-Zip software from http://www.7-zip.org/download.html.
    • Right-click on the downloaded executable and select Run as administrator.
    • Click Next and follow the instructions in the installation wizard to install the application.
  2. Download the Intel® System Studio IoT Edition installer file for Windows.
  3. Using 7-Zip, extract the installer file.

Note: Extract the installer file to a folder location that does not include any spaces in the path name.

For example, DO use C:\Documents\ISS. DO NOT include spaces in the path, such as C:\My Documents\ISS.

Linux* Installation
  1. Download the Intel® System Studio IoT Edition installer file for Linux*.
  2. Open a new Terminal window.
  3. Navigate to the directory that contains the installer file.
  4. Enter the command: tar -jxvf file to extract the tar.bz2 file, where file is the name of the installer file. For example, ss-iot-linux.tar.bz2. The command to enter may vary slightly depending on the name of your installer file.

Mac OS X® Installation
  1. Download the Intel® System Studio IoT Edition installer file for Mac OS X.
  2. Open a new Terminal window.
  3. Navigate to the directory that contains the installer file.
  4. Enter the command: tar -jxvf file to extract the tar.bz2 file, where file is the name of the installer file. For example, tar -jxvf iss-iot-mac.tar.bz2. The command to enter may vary slightly depending on the name of your installer file.

Note: If you get a message "iss-iot-launcher can’t be opened because it is from an unidentified developer", right-click the file and select Open with. Select the Terminal app. In the dialog box that opens, click Open.

Launch Intel® System Studio IoT Edition

  1. Navigate to the directory where you extracted the contents of the installer file.
  2. Open Intel® System Studio IoT Edition:
    • On Windows, double-click iss-iot-launcher.bat to launch Intel® System Studio IoT Edition.
    • On Linux, run export SWT_GTK3=0 and then ./iss-iot-launcher.sh.
    • On Mac OS X, run iss-iot-launcher.

Note: Using the iss-iot-launcher file (instead of the Intel® System Studio IoT Edition executable) will open Intel® System Studio IoT Edition with all the necessary environment settings. Use the iss-iot-launcher file to launch Intel® System Studio IoT Edition every time.

Add the Solution to Intel® System Studio IoT Edition

This section provides the steps to add the solution to Intel® System Studio IoT Edition, including creating a project and populating it with the files needed to build and run.

  1. Open Intel® System Studio IoT Edition. When prompted, choose a workspace directory and click OK.
  2. From the Intel® System Studio IoT Edition, select File | New | Create a new Intel Project for IoT. Then choose Intel® Gateway 64-Bit, as shown in Figure 7, and click Next until you reach the Create or select the SSH target connection screen, as shown in Figure 8. Input the IP address of the Intel® NUC (run the command ifconfig on the Intel® NUC if you don't know the IP address).

    New Intel IoT Project
    Figure 7. New Intel® IoT Project.
    Adding target Connection
    Figure 8. Adding Target Connection.

  3. Now give the project the name “Environment Monitor” and in the examples choose the “Air Quality Sensor” as the How To Code Sample (shown in Figure 9) and then click Next.

    Adding Project Name
    Figure 9. Adding Project Name.

  4. The preceding steps will have created a How to Code Sample project. Now we have to do a couple of small things in order to convert this into the Environment Monitor project:
    • Copy over the air-quality-sensor.cpp and grovekit.hpp files from the git repository's src folder into the new project's src folder in Intel® System Studio IoT Edition. This will overwrite the local files.
    • Next, right-click on the project name and follow the sequence C/C++ Build → Settings → IoT WRS 64-Bit G++ Linker → Libraries, and then add the libraries as shown in the following screenshot. This can be done by clicking the small green '+' icon at the top right of the Libraries view; the red 'x' next to the green '+' icon deletes libraries.
     

    Adding Libraries
    Figure 10. Adding libraries to the build path

  5. To run this project, first connect to the Intel® NUC using the IP address provided earlier. This can be done from the Target Selection View tab, or by right-clicking on the target (gateway device) and choosing the “Connect” option. Enter the username/password for the Intel® NUC when prompted.

Note: Ensure the Intel® NUC and the laptop (running Intel® System Studio IoT Edition) are connected to the same network.

Setup and Connect to a Cloud Service

Amazon Web Services (AWS)*

This solution was designed to send sensor data using the MQTT* protocol to AWS*. To connect the application to a cloud service, first set up and create an account.

To set up and create an account: https://github.com/intel-iot-devkit/intel-iot-examples-mqtt/blob/master/aws-mqtt.md

The following information should now be available:

  • MQTT_SERVER - use the host value you obtained by running the aws iot describe-endpoint command, along with the ssl:// (for C++) or mqtts:// protocol (for JavaScript*)
  • MQTT_CLIENTID - use <your device name>
  • MQTT_TOPIC - use devices/<your device name>
  • MQTT_CERT - use the filename of the device certificate as described above
  • MQTT_KEY - use the filename of the device key as described above
  • MQTT_CA - use the filename of the CA certificate (/etc/ssl/certs/VeriSign_Class_3_Public_Primary_Certification_Authority_-_G5.pem)

Additional Setup for C++ Projects

  1. When running your C++ code on the Intel® NUC, set the MQTT* client parameters in Eclipse* as outlined below. Go to Run Configurations and, in the commands to execute before application field, type:

    chmod 755 /tmp/; export MQTT_SERVER="ssl://<Your host name>:8883";
    export MQTT_CLIENTID="<Your device ID>"; export MQTT_CERT="/home/root/.ssh/cert.pem";
    export MQTT_KEY="/home/root/.ssh/privateKey.pem";
    export MQTT_CA="/etc/ssl/certs/VeriSign_Class_3_Public_Primary_Certification_Authority_-_G5.pem";
    export MQTT_TOPIC="devices/<your device ID>"

  2. Click Apply to save these settings.
  3. Click Run.

Adding MQTT
Figure 11. Adding MQTT variables to a Run Configuration
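
Inside the application, these exported values are read from the environment at startup. A minimal sketch of that lookup (standard C++ only; this is not the full sample code, and the helper function below is illustrative) might look like:

    // Read the MQTT* connection parameters exported in the Run Configuration.
    #include <cstdio>
    #include <cstdlib>
    #include <string>

    // Returns the value of an environment variable, or an empty string if unset.
    static std::string EnvOrEmpty(const char* name) {
        const char* value = std::getenv(name);
        return value ? std::string(value) : std::string();
    }

    int main() {
        const std::string server   = EnvOrEmpty("MQTT_SERVER");
        const std::string clientId = EnvOrEmpty("MQTT_CLIENTID");
        const std::string topic    = EnvOrEmpty("MQTT_TOPIC");
        const std::string cert     = EnvOrEmpty("MQTT_CERT");
        const std::string key      = EnvOrEmpty("MQTT_KEY");
        const std::string ca       = EnvOrEmpty("MQTT_CA");

        if (server.empty() || clientId.empty() || topic.empty()) {
            std::fprintf(stderr, "MQTT environment variables are not set.\n");
            return 1;
        }
        std::printf("Publishing to %s on %s as %s\n",
                    topic.c_str(), server.c_str(), clientId.c_str());
        // ...pass server/clientId/cert/key/ca to the MQTT client library here...
        return 0;
    }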

More Information

Intel® VTune™ Amplifier, Intel® Advisor and Intel® Inspector now include cross-OS support

Drexel University

Sole

Drexel has been proud to be invited to IUGS since the beginning. It’s a great opportunity for Drexel students to have their work recognized on a national stage. Holding it at the premier game development industry conference makes it a great practical exercise in promoting their projects, as well as a strong networking opportunity, regardless of the competition outcome.

Drexel’s Digital Media program produces many student projects every year, so when they open their internal competition, they get student teams applying from different years (sophomores through PhD candidates), courses, and programs under the DIGM umbrella. They hold their own competition using a format similar to the actual event, modeling their judging process on the IUGS rules, and adding the overall quality of the presentation to the gameplay and visual quality categories. With energetic discussions among the faculty, they choose from their 6-8 participating teams the one team that will represent Drexel.

It’s an exciting process that becomes a goal for the students, especially after Drexel’s first-place win for gameplay last year. Opportunities like IUGS inspire students to stay focused on their projects as goals beyond just grades and portfolios.

The Team:

  • Nabeel Ansari – Composer who is responsible for the game’s beautiful music and audio design
  • Nina DeLucia and Vincent De Tommaso – Artists who work tirelessly to paint, model, and sculpt all of Sole’s art assets
  • Thomas Sharpe – Creative director and programmer

The Inspiration:

The team says: “The creative process for conceptualizing ‘Sole’ has been one of the most challenging and rewarding journeys of our artistic careers. We believe video games are an incredibly powerful tool for capturing abstract emotions that are hard to put into words. So in approaching the original design of Sole, we started with a particular emotion and worked backwards to find what kinds of interactions would evoke that feeling. The game’s core mechanic and thematic content were inspired by the internal struggles we’re currently facing in trying to figure out who we are and where we’re going in our personal and creative lives. In many ways, Sole is an allegory for all that uncertainty we’re feeling working through our first major artistic endeavor.”

The Game:

Sole is an abstract, aesthetic-driven adventure game where you play as a tiny ball of light in a world shrouded in darkness. The game is a quiet journey through desolate environments where you’ll explore the remnants of great cities to uncover the history of an ancient civilization. Paint the land with light as you explore an abandoned world with a mysterious past.

Sole is a game about exploring a world without light. As you move through the environment and discover your surroundings, you’ll leave a permanent trail of light wherever you go. Free from combat or death, Sole invites players of all skill levels to explore at their own pace. With no explicit instructions, players are left to discover their objective as they soak up the somber ambiance.

Development and Hardware:

The team says: “Developing the game on Intel’s newest hardware has given us the opportunity to experiment with many visual effects we previously couldn’t achieve. As a result, we are now able to incorporate DirectX 11 shaders for our grass, add more props and details to the environment, and render the world with multiple post-processing effects. The feel of our game changed dramatically once we had access to hardware that was capable of powering the latest rendering technology.”

Savannah College of Art and Design

Kyon

The Savannah College of Art and Design participated in the IUGS 2016 competition and found that it was a great experience for the students, giving them an opportunity to present their work in front of judges. This year, SCAD sent out a department-wide call for entries. Faculty members evaluated entries based on a balance of game design, aesthetics, and overall product polish. “Kyon” was chosen as the best among its peers after the students produced an interesting playable version in their first 10 weeks of development.

The Team:

  • Jonathan Gilboa – Art lead, character artist
  • Neal Krupa – Environment artist, level designer
  • Chris Miller – Tech lead, gameplay programmer
  • Jason Thomas – Prop artist, UI programmer
  • Remi Gardaz – Prop artist, character rigger
  • Erika Flusin – Environment artist, prop artist
  • Jack Lipoff – Project lead, lead designer

The Inspiration:

The team says: “We were actually in development of a totally different game last year, and while going over what types of ambient wildlife we wanted, sheep were a popular option at one point. A running joke began of having sheep play a more and more central role in the game. Eventually, we were tasked with developing a different game, and we decided to just roll with what was originally a joke and develop a sheep-herding video game.”

The Game:

Kyon is a top-down third-person adventure game where the player assumes the role of a sheepdog named Kyon in mythological Ancient Greece. Kyon is sent by his master, Polyphemus, to find lost sheep and bring the herd home. The player must guide the herd with physical movement and special bark commands through dangerous environments filled with AI threats. All art assets are made using a PBR workflow, and the art team utilized advanced software such as NeoFur and SpeedTree for realistic effects. Level streaming allows an entire playthrough with no loading screens to interrupt gameplay.

Development and Hardware:

The team made use of Intel products in each of its machines, relying heavily on the power inside to push the boundaries of its sheep herd size and particle systems. The game was made entirely on machines using Intel technology.

SMU Guildhall

Mouse Playhouse

SMU Guildhall was asked to participate in the inaugural University Games Showcase in 2014 and proudly participated with “Kraven Manor.” The 2014 event was a great experience for both the students and the university, resulting in an invitation they look forward to annually. The team's selection process is the same every year: a small panel of three reviews the capstone games developed over the school year. The panel members are Gary Brubaker, Director of Guildhall; Mark Nausha, Deputy Director of the Game Lab; and Steve Stringer, Capstone Faculty. The panel uses three very high but simple measures: 1) quality in gameplay and visuals; 2) does the game demonstrate the team game pillars of the program?; and 3) are the students excellent ambassadors of their game and the university? Guildhall has quite a few games and students that exceed the panel’s expectations, making the job of choosing only one team very difficult.

The Team:

  • Clay Howell – Game designer
  • Taylor Bishop – Lead programmer
  • Ben Gibson – Programmer
  • Jeremy Hicks – Programmer
  • Komal Kanukuntla – Programmer
  • Michael Feffer – Lead level designer
  • Alexandre Foures – Level designer
  • Steve Kocher – Level designer
  • Jacob Lavender – Level designer
  • James Lee – Level designer
  • Sam Pate – Level designer
  • Taylor McCart – Lead artist
  • Devanshu Bishnoi – Artist
  • Nina Davis – Artist
  • Taylor Gallagher – Artist
  • Mace Mulleady – Artist
  • Mitchell Massey – Usability producer
  • Mario Rodriguez – Producer

The Inspiration:

The team says: “The idea for ‘Mouse Playhouse’ started out as a companion-based platformer for PC but slowly evolved into a VR puzzle game. The team rallied on the idea of being the first VR Capstone game and developing it for a new platform. Therefore, after the team decided to do a VR game, we studied what was fun about playing in VR by playing games already on the market. We found that throwing and being able to move objects around were fun mechanics to do in VR. This is why the main mechanic of the game revolves around moving and placing objects in specific locations to solve a puzzle. Additionally, we included throwing mechanics as extra mini-games such as shooting basketballs into a toy basketball hoop and throwing darts. We wanted to use simple but fun mechanics to showcase the skills of our developers to create a game for a new platform.”

The Game:

‘Mouse Playhouse’ is a light-hearted VR puzzle game in which you manipulate objects to solve puzzles and guide your pet mice toward the cheese. In Mouse Playhouse, you can also throw objects around, play basketball and darts, and even play the xylophone. There are a total of 15 levels in the game, and each one presents a different challenge. Players must use the blue objects to guide the mice away from trouble and toward the cheese. During development, the level designers created clever solutions that enabled them to record mixed reality using Unreal Engine. At the time, Unreal Engine did not support more than two Vive controllers or mixed reality recording, so the level designers used tools such as the Unreal Sequencer to “fake” mixed reality in the engine. This allowed the team to record gameplay and live footage on a green screen for their trailer.

Development and Hardware:

The students used Intel® Core™ i7 processor-based desktops, won at a previous Intel GDC Showcase event, for development. With the addition of an NVIDIA* 1080 GPU, these machines provided a lag-free development environment. When the students did usability testing, a clear result was that high-performance computing was required for a comfortable VR experience.


Recipe: Building and Running GROMACS* on Intel® Processors

Purpose

This recipe describes how to get, build, and run the GROMACS* code on Intel® Xeon® and Intel® Xeon Phi™ processors for better performance on a single node.

Introduction

GROMACS is a versatile package for performing molecular dynamics, using Newtonian equations of motion, for systems with hundreds to millions of particles. GROMACS is primarily designed for biochemical molecules like proteins, lipids, and nucleic acids that have a multitude of complicated bonded interactions. But, since GROMACS is extremely fast at calculating the non-bonded interactions typically dominating simulations, many researchers use it for research on non-biological systems, such as polymers.

GROMACS supports all the usual algorithms expected from a modern molecular dynamics implementation.

The GROMACS code is maintained by developers around the world. The code is available under the GNU General Public License from www.gromacs.org.

Code Access

Download GROMACS:

Workloads Access

Download the workloads:

Generate Water Workloads Input Files:

To generate the .tpr input files:

  • tar xf water_GMX50_bare.tar.gz
  • cd water-cut1.0_GMX50_bare/1536
  • gmx_mpi grompp -f pme.mdp -c conf.gro -p topol.top -o topol_pme.tpr
  • gmx_mpi grompp -f rf.mdp -c conf.gro -p topol.top -o topol_rf.tpr

Build Directions

Build the GROMACS binary using a CMake configuration for Intel® Compiler 2017.1.132 + Intel® MKL + Intel® MPI 2017.1.132:

Set the Intel Xeon Phi BIOS options to be:

  • Quadrant Cluster mode
  • MCDRAM Flat mode
  • Turbo Enabled

For Intel Xeon Phi, build the code as:

  • BuildDir="${GromacsPath}/build"    # Create the build directory
  • installDir="${GromacsPath}/install"
  • mkdir $BuildDir

  • source /opt/intel/<version>/bin/compilervars.sh intel64    # Source the Intel compiler, MKL and IMPI
  • source /opt/intel/impi/<version>/mpivars.sh
  • source /opt/intel/mkl/<version>/mklvars.sh intel64

  • cd $BuildDir    # Set the build environment for Intel Xeon Phi
  • FLAGS="-xMIC-AVX512 -g -static-intel"
  • CFLAGS=$FLAGS CXXFLAGS=$FLAGS CC=mpiicc CXX=mpiicpc cmake .. -DBUILD_SHARED_LIBS=OFF -DGMX_FFT_LIBRARY=mkl -DCMAKE_INSTALL_PREFIX=$installDir -DGMX_MPI=ON -DGMX_OPENMP=ON -DGMX_CYCLE_SUBCOUNTERS=ON -DGMX_GPU=OFF -DGMX_BUILD_HELP=OFF -DGMX_HWLOC=OFF -DGMX_SIMD=AVX_512_KNL -DGMX_OPENMP_MAX_THREADS=256

For Intel Xeon, set the build environment and build the code as above, with these changes:

  • FLAGS="-xCORE-AVX2 -g -static-intel"
  • -DGMX_SIMD=AVX2_256

Build GROMACS:

  • make -j 4
  • sleep 5
  • make check

Run Directions

Run the workloads on the Intel Xeon Phi processor with the following environment settings and command lines (nodes.txt contains: localhost:272):


	export  I_MPI_DEBUG=5
	export I_MPI_FABRICS=shm
	export I_MPI_PIN_MODE=lib
	export KMP_AFFINITY=verbose,compact,1

	gmxBin="${installDir}/bin/gmx_mpi"

	mpiexec.hydra -genvall -machinefile ./nodes.txt -np 66 numactl -m 1 $gmxBin mdrun -npme 0 -notunepme -ntomp 4 -dlb yes -v -nsteps 4000 -resethway -noconfout -pin on -s ${WorkloadPath}water-cut1.0_GMX50_bare/1536/topol_pme.tpr
	export KMP_BLOCKTIME=0
	mpiexec.hydra -genvall -machinefile ./nodes.txt -np 66 numactl -m 1 $gmxBin mdrun -ntomp 4 -dlb yes -v -nsteps 1000 -resethway -noconfout -pin on -s ${WorkloadPath}lignocellulose-rf.BGQ.tpr
	mpiexec.hydra -genvall -machinefile ./nodes.txt -np 64 numactl -m 1 $gmxBin mdrun -ntomp 4 -dlb yes -v -nsteps 5000 -resethway -noconfout -pin on -s ${WorkloadPath}water-cut1.0_GMX50_bare/1536/topol_rf.tpr

Run the workloads on the Intel Xeon processor with the following environment settings and command lines:


	export  I_MPI_DEBUG=5
	export I_MPI_FABRICS=shm
	export I_MPI_PIN_MODE=lib
	export KMP_AFFINITY=verbose,compact,1

	gmxBin="${installDir}/bin/gmx_mpi"

	mpiexec.hydra -genvall -machinefile ./nodes.txt -np 72 $gmxBin mdrun -notunepme -ntomp 1 -dlb yes -v -nsteps 4000 -resethway -noconfout -s ${WorkloadPath}water-cut1.0_GMX50_bare/1536_bdw/topol_pme.tpr
	export KMP_BLOCKTIME=0
	mpiexec.hydra -genvall -machinefile ./nodes.txt -np 72 $gmxBin mdrun -ntomp 1 -dlb yes -v -nsteps 1000 -resethway -noconfout -s ${WorkloadPath}lignocellulose-rf.BGQ.tpr
	mpiexec.hydra -genvall -machinefile ./nodes.txt -np 72 $gmxBin mdrun -ntomp 1 -dlb yes -v -nsteps 5000 -resethway -noconfout -s ${WorkloadPath}water-cut1.0_GMX50_bare/1536_bdw/topol_rf.tpr

Performance Testing

Performance tests for GROMACS are illustrated below with comparisons between an Intel Xeon processor and an Intel Xeon Phi processor against three standard workloads: water1536k_pme, water1536k_rf, and lignocellulose3M_rf. In all cases, turbo mode is turned on.

Testing Platform Configurations

The following hardware was used for the above recipe and performance testing.

Processor                      Intel® Xeon® Processor E5-2697 v4       Intel® Xeon Phi™ Processor 7250
Stepping                       1 (B0)                                  1 (B0) Bin1
Sockets / TDP                  2S / 290W                               1S / 215W
Frequency / Cores / Threads    2.3 GHz / 36 / 72                       1.4 GHz / 68 / 272
DDR4                           8x16 GB 2400 MHz (128 GB)               6x16 GB 2400 MHz
MCDRAM                         N/A                                     16 GB Flat
Cluster/Snoop Mode/Mem Mode    Home                                    Quadrant/Flat
Turbo                          On                                      On
BIOS                           GRRFSDP1.86B.0271.R00.1510301446        GVPRCRB1.86B.0011.R04.1610130403
Compiler                       ICC-2017.1.132                          ICC-2017.1.132
Operating System               Red Hat Enterprise Linux* 7.2           Red Hat Enterprise Linux* 7.2
Kernel                         3.10.0-327.el7.x86_64                   3.10.0-327.13.1.el7.xppsl_1.3.3.151.x86_64

GROMACS Build Configurations

The following configurations were used for the above recipe and performance testing.

  • GROMACS Version: GROMACS-2016.1
  • Intel® Compiler Version: 2017.1.132
  • Intel® MPI Library Version: 2017.1.132
  • Workloads used: water1536k_pme, water1536k_rf, and lignocellulose3M_rf

University of Central Florida

The Channeler

FIEA's decision to participate was an easy one. In their view, IUGS has turned into a great celebration and showcase of student games at GDC and they love competing with peer programs. “The Channeler” was a great game for them to pick because it has a mix of innovation, gameplay, and beautiful art. Also, because it uses eye-tracking (through a partnership with the Tobii Eye Tracker) as its main controller, they believe it will really stand out from the rest of the field.

The Team:

  • Summan Mirza – Project lead
  • Nihav Jain – Lead programmer
  • Derek Mattson – Lead designer
  • Alex Papanicolaou – Lead artist
  • Peter Napolitano – Technical designer
  • Raymond Ng – Technical designer/art manager
  • Claire Rice – Environment/UI artist
  • KC Brady – Environment artist/UX design
  • Matt Henson – Gameplay programmer
  • Steven Ignetti – Gameplay/AI programmer
  • Yu-Hsiang Lu – SDK programmer
  • Kishor Deshmukh – UI programmer

The Inspiration:

Summan Mirza says: “The original pitch started with the idea of innovating by creating gameplay that could not be replicated with traditional controllers. Eye-tracking, as a mechanic, had many fascinating possibilities that lent itself well to immersing a player in a world. After all, the vast majority of us already use our eyes heavily to play games! We had first looked into developing ‘The Channeler’ by installing eye-tracking into a VR headset, but found that modding one to use eye-tracking was too problematic and expensive at the time. Instead, we found the Tobii EyeX, a slim and affordable eye-tracking peripheral that did not carry the cumbersome and obtrusive nature of headset-style peripherals. From then on, we mass-prototyped in aspects such as narrative, combat, and puzzles, and surprisingly found the breadth of possibilities for eye-tracking-based gameplay vast and exciting, but sadly, beyond our scope.  So, we focused on using eye-tracking for puzzle-based gameplay. The kooky City of Spirits in The Channeler formed the perfect wrapper for this. It allowed us to create some out-of-the-box puzzles that worked well in such a silly world, and the feeling of controlling the world with your eyes really gave players a sense of possessing an otherworldly power.

The Game:

“The Channeler” takes place in a kooky city of spirits, where the denizens are plagued by mysterious disappearances. Fortunately, you are a Channeler. Gifted with the “Third Eye,” you possess a supernatural ability to affect the world around you with merely your sight. Explore the spooky night market and solve innovative puzzles to find the missing spirits! Innovation is what really sets The Channeler apart from other games; not many games out there use eye-tracking as a main mechanic.  Whether it’s trying to beat a seedy ghost in a shuffling shell game, tracing an ancient rune with your gaze, or confronting possessed statues that rush toward you with every blink—our game utilizes eye movement, blinking, and winking mechanics that provide only a sample of the vast possibilities for eye-tracking games.

Development and Hardware:

Summan Mirza says: “Game development brings up many challenges along the way, especially when troubleshooting hardware issues. However, using Intel hardware for ‘The Channeler’ was virtually effortless. We never had to worry about things like frame-rate issues, even though our game heavily used taxing graphical aspects such as overlapping transparencies with the ghosts. To put it simply, working with Intel hardware was certainly the smoothest part of the game development process.”

University of Utah

Wrecked: Get Your Ship Together

In UU’s Entertainment Arts and Engineering program, all student capstone projects and all master’s student projects are automatically entered into a university event where faculty not involved in the game projects review every entry and select four finalists. A subcommittee of three faculty members then chooses the team that will represent the university. This year, “Wrecked” was chosen.

The Team:

  • Matt Barnes – Lead artist
  • Jeff Jackman – Artist
  • Brock Richards – Lead technical artist
  • Sam Russell – Lead engineer
  • Samrat Pasila – Engineer
  • Yash Bangera – Engineer
  • Michael Brown – Engineer
  • Shreyas Darshan – Engineer
  • Sydnie Ritchie – Producer, team lead

The Inspiration:

Bob Kessler, Executive Director and founder of the School of Computing, says: “I was co-teaching the class when the games were originally created. We went through a long process that involved studying award-winning games, coming up with many different ideas, narrowing those down to a handful, paper prototyping, and then digital prototyping to try to find the fun. Sydnie's team had a goal to try to solve the problem that the VR experience is typically for one person at a time. For example, if you had a party, then only one person can experience the game. They decided that by integrating players using their mobile phones with the person using the VR headset, it would be a better experience for all.”

The Game:

“Wrecked: Get Your Ship Together” is the living-room party game for VR! One player is on the Vive while everyone else in the room plays on their mobile phones. Together, they must repair their ship to escape the planet on time. The Vive player must navigate a foreign planet, both on foot and by hover ship, to scavenge parts and repair the team’s mothership. The mobile players command helpful drones that follow the player and give aid. Specifically, the mobile players can give directional guidance, or they can obtain speed boosts for their captain by successfully executing orders.

Another problem specific to VR is that of traveling world-scale environments in a room-scale experience; a living room is generally a bit smaller than the world of Skyrim. The development team’s solution is to give the player a hover ship, which makes the player's actual physical chair part of the play space. When players sit, they can fly around at world scale. When they stand up, they can experience the full joys of room scale.

The development team feels both the mobile integration with VR and the physical augmentation of the game are compelling, and they are excited to be exploring this new space.

Development and Hardware:

The EAE studio has over 100 computers, all with Intel hardware. Besides the wonderful and fast Intel processors, the team's computers were also outfitted with donated SSDs. Having an SSD is great, as it eliminates the latency of file and other disk access.

Rochester Institute of Technology

Gibraltar

MAGIC Spell Studios LLC at Rochester Institute of Technology (RIT) looks forward to the Intel University Games Showcase each year. This is an incredible opportunity to experience what the most talented students in nationally ranked programs (by the Princeton Review) are creating. The future of our industry is in great hands with the talented, passionate visionaries who showcase their work at this event. Choosing John Miller to represent RIT was an easy decision to make. We were first introduced to John last spring following his participation in the Imagine Cup finals. John has all of the skills necessary to succeed and be a leader in this industry. His work ethic, passion, and talent are impressive and he is a terrific example of the caliber of students that we can offer to game development companies.

The Team:

  • John Miller – Creator, designer, and developer
  • Danielle Carmi – 2D/3D artist and animator
  • Angela Muscariello – Sound designer
  • Elena Mihajlov – Music composer

The Inspiration:

John Miller says: “’Gibraltar’ is the game that I wanted to play while sitting in the back of class during high school. I wanted a game that I could play with one hand while still listening to the lecture, and that would hold my attention but wouldn’t make me feel trapped into a 45-minute session. I wanted to play something that was fast, simple, and low commitment while still strategically deep and intellectually stimulating. I also wanted to bring back that experience of sitting at a computer with a friend and playing old games like ‘Advance Wars,’ ‘Lego Star Wars,’ and ‘Backyard Football.’ The idea of Gibraltar bounced around in my head for a year and eventually I started putting those ideas down on paper and in my senior year of high school, I started development on the game. The art style and some mechanics have evolved along the way but the essence of that quick, lightweight strategy game is still there in Gibraltar today.”

The Game:

“Gibraltar” is a quick, turn-based strategy game in which two players send armies of adorable robots into battle for control of the game board. The more territory the player controls on the board, the more they can move their robots on their turn. This means that players are free to craft their own strategies and game play is very fluid. A match of Gibraltar usually lasts between 5 and 10 minutes, so you can fit a match in anywhere. Gibraltar is meant to be played between two players, sitting at the same screen. That kind of head-to-head competitive experience is something John always enjoyed. The player has four cute robot types to choose from when setting up their army, each with its own strengths and weaknesses. You only get to spawn your army once, so choose its composition carefully! The synergy between the different pieces allows unique gameplay and endless strategies and is simple enough for everyone to pick up. Players can also use special abilities that can change the course of the game but they are expendable and cost action points to play. Gibraltar features a story mode to help introduce players to the game and includes a fun cast of characters. The game ships a built-in map editor for players to design their own maps and play them with their friends.

University of Southern California

Arkology

The USC GamePipe Laboratory has participated in the Intel University Games Showcase since it was created.

The GamePipe Laboratory faculty looks at the games being built in its Advanced Games course and other games shown at its Showcase, and then they agree on which game is the best. That game is the one that goes to IUGS, and this year it is “Arkology.”

 

 

The Team:

Core members:

  • Powen Yao – Team lead responsible for overall design, research, and production
  • Thomas Kao – Engineer responsible for overall project, architecture, network
  • Joey Hou – Engineer responsible for overall project, gameplay
  • Tiffany Shen – Artist responsible for models, textures, effects
  • Grace Guo – Developer responsible for user interface, user experience
  • Qing Mao – Technical artist responsible for 3D art, tech art, effects
  • Eric Huang – Gameplay engineer

Newly-joined members:

  • Leo Sheng – Gameplay engineer
  • Jian Zhang – AI engineer
  • Subhayu Chakravorty – Gameplay engineer

Former members:

  • Divyanshu Bhardwaj – Gameplay engineer
  • Christine Gong – Gameplay engineer
  • Yue Wang – Gameplay engineer
  • Guangxi Jin – Gameplay engineer

The Inspiration:

Powen Yao says: “As a PhD student under Professor Michael Zyda, I've been studying artificial intelligence in games and recently in virtual reality interaction. Real-time strategy provides a great platform for research and demonstration of game AI. My recent foray into virtual reality interaction led to my belief that it has great potential that could be unlocked with a combination of novel interaction techniques supported by AI. My interests in the two areas led to ‘Arkology,’ a virtual reality real-time strategy game. Right now, there are many virtual reality games in the market featuring swordplay or gun fights, but we want to take the game in a different direction. Instead of simply using the player's controllers to represent a gun or a sword, we are examining ways for the player to better process information and to effectively interact with many game elements at once. In other words, to design a real-time strategy game specifically for virtual reality to provide players with a unique virtual reality experience.

The Game:

In “Arkology,” the player has been chosen as the commander of Ark, a massive space-faring arcology designed to preserve humanity's continued prosperity and survival. The player can control the game using simple and intuitive motion control. From the Operations Room in the heart of the Ark, the player must strategize, command, and lead his forces to preserve what may be the last of humanity. The game can be described as a real-time tabletop war game where players need to control their miniature game pieces to fight the opposing force. A player's goal is to achieve the mission objective ranging from defending a valuable target to annihilating the enemy force.

Thematically, we want our players to feel like military commanders making strategic decisions and seeing their plans come to life. We want the players to feel like generals in World War II movies drafting their invasion plans over the map of Europe. We want to let the players live the scenes from the movie “Ender's Game” where the commander's will and orders get carried out by his officers.

Our focus for this project is in exploring novel virtual reality interactions to best utilize the fact that players have access to a three-dimension world. We are developing a series of virtual gears that will help a player better command an army and survey the battlefield. Some examples of what we have or are working on:

  • Adaptable controller for the player to quickly change the functionality for the situation at hand.
  • Augmented vision goggle to let the player see or hide additional game stats and information.
  • A utility-belt for players to store and access game elements.
  • Customizable battlefield camera and screen for players to monitor the battlefield.

Development and Hardware:

Professor Zyda says, “Having hardware donated from Intel helps our program out enormously and gives our students access to hardware of the future!”

University of Wisconsin–Stout

Everend

The University of Wisconsin (UW)–Stout has been sending students, faculty, and alumni to the GDC in San Francisco since 2011, through both grant-funded travel and class-based domestic study-abroad opportunities. In addition, UW–Stout has attended the IUGS and always left inspired to create games and someday be a participant, as well. UW–Stout is happy to be selected as a participant this year because of its consistent rankings in the Princeton Review. Everend is a game that represents everything that UW–Stout is about: bringing artists, designers, and programmers together to create imaginative virtual worlds, stories, and characters.

The Team

  • Will Brereton—Programming lead
  • Emily Dillhunt—Art lead
  • Mitch Clayton—Design lead
  • Daniel Craig—Programming and lighting
  • Megan Daniels—Art and public relations
  • Gabe Deyo—Programming and cameras
  • Phoenix Hendricks—Programming and sound
  • Alex Knutson—Rigging and animation
  • Logan Larson—Programming and level design
  • Zachary Pasterski—Artist and level design
  • Hue Vang—Cinematic and concept artist
  • Tyler Walvort—Programming generalist
  • Dave Beck—Professor and executive producer

The Inspiration

The team says, “The game idea was very loose at first and closer to a Japanese role-playing game in style. Kaia, then our unnamed purple owl protagonist, lived underground with a village of burrowing owls when their world started to shake and crumble. It was Kaia’s job to figure out why. During a pitch presentation, a professor who was sitting in remarked, ‘Why not have the character be out of place? Have the cave be a strange place,’ and that comment colored the rest of our development and sent us in the direction you see today. Environmentally, the game is inspired by the massive caves of Hang Sun Doong in Vietnam, where the caverns are large enough that forests grow within. We wanted to capture the ambient, colorful, vast feeling we got from looking at pictures of caves like these instead of the cramped, dark, wet feeling usually conjured up by cave environments.”

The Game

Players explore a vast, ancient cave and overcome its many obstacles as Kaia, an adolescent owl. Everend is a single-player, exploration-driven 3D puzzle platformer. Players solve various problems and puzzles throughout the environment as they progress through various levels of the caves, trying to reach the surface. Kaia collects various items throughout the journey, which allows for increasingly complex problems and solutions. The team wanted to focus on the atmosphere and ambience of the game environment, so they kept the UI and mechanics simple and noninvasive to place the focus on exploration. Players can also try to piece together the story of what happened in the caves as they explore by studying the cave environment and clues left behind in the form of cave paintings. Everend is a short, self-contained experience, with a character and environment the team hopes will leave a lasting impression.

Development and Hardware

The team says, “Our development team was lucky to have access to a dedicated game design computer lab in the School of Art & Design at UW–Stout. The lab regularly updates its technology every two years with new workstations, so we had the opportunity to work with HP workstations that had game industry-standard Intel processors, complemented by high-end graphics cards and more than enough RAM. In addition, we were able to use these machines in the playtesting and exhibition of the games, making it a great overall experience that was all built on the foundation of Intel processor technology.”

A Python Script to Summarize MKL_VERBOSE Output

Intel® Math Kernel Library (Intel® MKL) provides a mechanism to capture the run time information of BLAS and LAPACK domain functions used in a target application. This mechanism is enabled by either setting the environment variable MKL_VERBOSE to 1 or calling the support function, mkl_verbose(1). When a user application is run under this configuration mode, Intel MKL outputs the function name, parameters and time taken to execute the corresponding function. Below is the MKL_VERBOSE output produced by a sample program invoking MKL BLAS functions: 

Refer to the related chapter of Intel MKL documentation for more details about MKL_VERBOSE: https://software.intel.com/en-us/node/528417
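
For illustration, a hypothetical C++ program like the one below (not the exact sample referred to above) is enough to trigger verbose output; each BLAS call it makes produces one MKL_VERBOSE line that the summary script can later aggregate:

    // Hypothetical sample: call a BLAS routine with verbose mode enabled so that
    // Intel MKL prints one MKL_VERBOSE line per BLAS/LAPACK call it executes.
    #include <mkl.h>
    #include <vector>

    int main() {
        mkl_verbose(1);                      // same effect as setting MKL_VERBOSE=1

        const int n = 512;
        std::vector<double> a(n * n, 1.0), b(n * n, 2.0), c(n * n, 0.0);

        // DGEMM: C = 1.0 * A * B + 0.0 * C; the call is logged with its parameters
        // and timing, which the summary script can then aggregate.
        cblas_dgemm(CblasRowMajor, CblasNoTrans, CblasNoTrans,
                    n, n, n, 1.0, a.data(), n, b.data(), n, 0.0, c.data(), n);
        return 0;
    }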

MKL_VERBOSE can produce large amounts of output (on the order of hundreds of megabytes or even tens of gigabytes) when running applications that make heavy use of MKL, which makes it difficult to understand how MKL functions are used in the application. To alleviate this problem, we provide a Python script that summarizes the output produced by verbose mode and generates a report grouped by function name, parameter list, call count, and execution time. The tool takes an MKL_VERBOSE log as input and produces a summary report on stdout. Below is an example of the output produced by the tool:

The tool is provided as an attachment to this article. Users can find the download link at the bottom of the page. After downloading, rename the file to have the .py suffix.

 


Why threading matters for Ashes of the Singularity*

You face a constant balancing act between features and performance as you build your game. The GPU is the most obvious bottleneck you'll encounter as you add graphical effects and features to your game, but your game can also become CPU-bound.

In addition to the usual CPU loads from game logic, physics and artificial intelligence (AI) calculations, the graphical effects that make your game feel immersive can be CPU-intensive, typically making the bottleneck shift back and forth between the GPU and the CPU throughout development of the game itself.

Modern microprocessors have great single-core performance, but depend on multiple CPU cores to give better overall performance. To use all that available CPU compute power, applications run fastest when they're multithreaded so that code runs concurrently on all available CPU cores.

This video showcases Ashes of the Singularity*, a recent real-time strategy (RTS) game from Oxide Games and Stardock Entertainment. You'll see how it delivers excellent gameplay and performance on systems with more CPU cores.

Ashes of the Singularity
Figure 1: Ashes of the Singularity* shows how a well-threaded game can get better frame rates on systems with more CPU cores.

By building a new engine and using Direct3D* 12, Oxide made it possible for Ashes of the Singularity to use all available processor cores. It runs great on a typical gaming system and scales up to run even better on systems with more cores. You can use these same techniques in your game to get the best performance from your CPU.

Direct3D* 12 eliminates bottlenecks and allows high performance

To get the best frame rates in Ashes of the Singularity, the Oxide team used Direct3D* version 12. Earlier versions of Direct3D run well but have a few bottlenecks. Version 12 incorporates several API changes that remove bottlenecks that tend to slow games down: multiple state objects are consolidated into pipeline state objects, and a thinner hardware abstraction layer minimizes API overhead and moves resource-barrier management out of the graphics driver.

It's possible to create commands from multiple threads in Direct3D 11. However, there's so much serialization required that games never got much speedup from multithreading with the older version. With the API changes in Direct3D 12, this fundamental limit doesn't exist anymore. Without that serialization, it's now practical for games to fill command lists from multiple threads and have much better overall threading.
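
As an illustration only (a bare-bones sketch, not Oxide's engine code), the multithreaded recording pattern looks roughly like this: each worker thread gets its own command allocator and command list, records and closes its list, and the main thread submits everything in a single ExecuteCommandLists call:

    // Sketch of multithreaded command-list recording in Direct3D 12.
    // Assumes a D3D12-capable adapter; error handling is omitted for brevity.
    #include <windows.h>
    #include <d3d12.h>
    #include <wrl/client.h>
    #include <thread>
    #include <vector>
    #pragma comment(lib, "d3d12.lib")

    using Microsoft::WRL::ComPtr;

    int main() {
        ComPtr<ID3D12Device> device;
        D3D12CreateDevice(nullptr, D3D_FEATURE_LEVEL_11_0, IID_PPV_ARGS(&device));

        D3D12_COMMAND_QUEUE_DESC qdesc = {};
        qdesc.Type = D3D12_COMMAND_LIST_TYPE_DIRECT;
        ComPtr<ID3D12CommandQueue> queue;
        device->CreateCommandQueue(&qdesc, IID_PPV_ARGS(&queue));

        const int kThreads = 4;
        std::vector<ComPtr<ID3D12CommandAllocator>> allocators(kThreads);
        std::vector<ComPtr<ID3D12GraphicsCommandList>> lists(kThreads);
        std::vector<std::thread> workers;

        for (int i = 0; i < kThreads; ++i) {
            // One allocator per thread: allocators must not be shared across threads.
            device->CreateCommandAllocator(D3D12_COMMAND_LIST_TYPE_DIRECT,
                                           IID_PPV_ARGS(&allocators[i]));
            device->CreateCommandList(0, D3D12_COMMAND_LIST_TYPE_DIRECT,
                                      allocators[i].Get(), nullptr,
                                      IID_PPV_ARGS(&lists[i]));
            workers.emplace_back([&, i] {
                // Record draw/dispatch/copy commands here on this thread...
                lists[i]->Close();        // each thread closes its own list
            });
        }
        for (auto& t : workers) t.join();

        // Submit all recorded lists in one call from the main thread.
        std::vector<ID3D12CommandList*> raw;
        for (auto& l : lists) raw.push_back(l.Get());
        queue->ExecuteCommandLists(static_cast<UINT>(raw.size()), raw.data());
        return 0;
    }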

By taking advantage of these changes to the API, Ashes of the Singularity runs best on Direct3D 12.

The Nitrous* Engine makes it possible

Oxide wanted to create a more complex RTS than any built before, with support for larger armies with more units and larger maps. To build this next-generation RTS game, the development team knew that they needed a new game engine; existing game engines couldn't support the unit counts or map sizes they wanted. They started from scratch to build the Nitrous Engine* to make Ashes of the Singularity possible.

Any new engine must first deliver high-performance rendering. With that in mind, the Nitrous Engine has well-tuned support for the latest graphics APIs, multiple GPUs, and async compute.

The game supports many units for each player, as well as large maps. Simulating the physics of this many in-game objects across a large terrain generates a large CPU load. More importantly, the AI workload is massive since the behavior of each unit needs to be simulated. There's also an emergent property from the large number of units the game supports. With more units, it becomes harder for the player to directly manage units. Oxide built a layered approach to AI where armies cooperate in sensible ways that use the relative strengths of each unit while paying attention to their relative weaknesses.

To make this kind of scale possible, Oxide threaded the Nitrous Engine by breaking work up into small jobs. The job system is designed for flexibility, and the small jobs can be spread out among as many CPU cores as possible. Oxide carefully tuned the job scheduler for speed. Since most Intel® processors include Intel® Hyper-Threading Technology, the scheduler also looks for locality between jobs: jobs that share cached data are scheduled on different logical cores of the same physical CPU core, an approach that yields the best performance and job throughput.
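
As a rough sketch of that idea (not Oxide's scheduler; it assumes logical processors 0 and 1 are hyper-thread siblings, which a production scheduler would verify with GetLogicalProcessorInformation), two jobs that read the same data can be pinned to sibling logical cores like this:

    // Rough sketch: pin two jobs that share cached data onto hyper-thread siblings.
    #include <windows.h>
    #include <numeric>
    #include <thread>
    #include <vector>

    static std::vector<int> g_shared(1 << 20, 1);   // data both jobs will read

    static void Job(DWORD logicalCore) {
        // Restrict this worker to one logical processor before doing its work.
        SetThreadAffinityMask(GetCurrentThread(), DWORD_PTR(1) << logicalCore);
        volatile long long sum = std::accumulate(g_shared.begin(), g_shared.end(), 0LL);
        (void)sum;
    }

    int main() {
        // Assumption: logical processors 0 and 1 share one physical core and its cache.
        std::thread a(Job, 0), b(Job, 1);
        a.join();
        b.join();
        return 0;
    }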

Regardless of approach, there will always be a bottleneck somewhere when you add complexity to a game.  As you develop your game, think about the relative CPU and GPU loads that you might expect. Understand how your game will work when the GPU is the bottleneck and consider how it will behave when the CPU is the bottleneck.

Intel® Core™ processors make it shine

Intel® Core™ processors can help make your game shine like Ashes of the Singularity. As you design and optimize your game, target mid-range processors and design for scalability up to the most powerful processors.

By using the techniques we describe here, Ashes of the Singularity runs great on the best-in-class Intel® Core™ i7-6950X processor Extreme Edition, which has 10 physical CPU cores and a large cache for the best overall performance. With the work divided into jobs and a great GPU, the game's frame rate increases with more CPU cores. On identical systems with varying numbers of CPU cores, the frame rate improves steadily up to the max of 10 physical cores.

The game also includes some massive maps. Since the player and unit counts get so large, the AI burden for a fully-outfitted set of players becomes huge. After careful tuning, Ashes of the Singularity allows these maps only on systems with large numbers of CPU cores (six or more); the job scheduler then automatically puts work on all available cores. This is a great approach for you to pursue: detect your system's core count with a function like GetSystemInfo() if you need to selectively enable features.
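
On Windows*, a minimal sketch of such a check (illustrative only; the six-core threshold below is just an example, not the game's actual logic) could be:

    // Query the number of logical processors and gate an optional feature on it.
    #include <windows.h>
    #include <cstdio>

    int main() {
        SYSTEM_INFO si = {};
        GetSystemInfo(&si);                  // fills in dwNumberOfProcessors
        DWORD logicalCores = si.dwNumberOfProcessors;

        // Hypothetical threshold: only enable the largest maps on wide machines.
        const DWORD kMinCoresForHugeMaps = 6;
        bool enableHugeMaps = logicalCores >= kMinCoresForHugeMaps;

        std::printf("logical processors: %lu, huge maps %s\n",
                    (unsigned long)logicalCores,
                    enableHugeMaps ? "enabled" : "disabled");
        return 0;
    }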

Scalable effects add some sparkle

Although this game is mostly focused on ever-faster frame rates with more cores, there was a little extra CPU room for some bonuses. With more cores, Ashes of the Singularity will automatically enable advanced particle effects on some units as well as temporal motion blur.

Ashes of the Singularity
Figure 2: Advanced particle effects on two large dreadnought units look awesome.

The particle effects give added visual impact, but they don't affect gameplay.

Ashes of the Singularity
Figure 3: Temporal motion blur adds realism to fast-moving units.

With temporal motion blur, fast-moving units are simulated across multiple frames and then combined in a blur, adding visual realism to these units.

Check it out, and then build your own awesome game!

Ashes of the Singularity delivers excellent performance that scales up with available CPU cores. It does this by using the multithread-capable Direct3D 12 API, efficient job partitioning and scheduling, and active detection of core availability to enable more complex features. More cores unlock more complex gameplay with larger maps, and the combination of temporal motion blur and enhanced particle effects makes for a great-looking game. The recently-released expansion Ashes of the Singularity: Escalation builds on these advantages, adding even larger maps, an improved UI, and better performance across different platforms.

We hope you are inspired to apply these design principles to your development project, and create an awesome game!

Graphics tuning? Start with Intel® Graphics Performance Analyzers

Graphics optimization is a large subject in its own right. To get started tuning your game, we recommend the Intel® Graphics Performance Analyzers. Check out these tools at https://software.intel.com/en-us/gpa.

Out of the Box Network Developers Newsletter – March 2017

It’s been a busy time for Out of the Box Network Developers and we have lots of news and information to share with you. Read on for information about upcoming events in Santa Clara, Portland, Shannon, and Bangalore, and get recaps of recent meetups and devlabs. Learn about new articles on Intel® Developer Zone, meet new Innovators, and check out some developer projects on DevMesh. To top things off, we’ve added a new section to highlight upcoming Intel® Network Builders University classes and events.

Contents

Upcoming Meetups and Devlabs

Out of the Box Network Developers Santa Clara

Cloud Networking Deep Dive
Thursday, April 6, 2017, 5:30 PM to 9:30 PM

SC9-Auditorium, Garage B
2250 Mission College Blvd, Santa Clara, CA

Google’s software-defined global network, underlying network virtualization stack (Andromeda) and network services, along with open innovations provide the foundation for securely delivering a diversity of workloads and services on Google Cloud Platform (GCP). Google Cloud Virtual Private Cloud (VPC) provides you with a secure and flexible sandbox to run your cloud workloads. Global Load Balancing and Cloud CDN deliver global reach, scale and high availability and secure your apps against DDoS attacks. Cloud Interconnect enables seamless connectivity options for hybrid/multi-cloud app delivery. This talk provides an in-depth look at Google Cloud Networking, what's under the hood, the benefits it delivers along with real world GCP customer stories. Presented by Prajakta Joshi, who is a product manager at Google focused on delivering Cloud Networking products to scale, secure and simplify your Google Cloud Platform deployments. Prajakta previously served as Director of Product for ONOS, a Software-Defined Networking (SDN) Control Plane Platform, which she helped productize and launch in open source.

Out of the Box Network Developers Portland

Talk about NFV/SDN concepts and overview of ONOS
Tuesday, 28 March 2017, 5:00 PM to 10:00 PM

Intel Auditorium RA1
2501 NW 229th Ave, Hillsboro, OR

This is our first Meetup! We will talk generally about NFV, SDN, and ONOS, as well as our experiences deploying and running them, favorite vendor flavors, and more.

7:00 - 7:30   Networking
7:30 - 8:00   Overview of NFV and SDN by Baltazar Ruiz
8:00 - 8:30   Overview of ONOS (Open Network Operating System) by Karla Saur
8:30 - 9:00  Q&A

About Karla Saur - Holds a PhD in Computer Science, currently works as a Research Scientist on Distributed Systems.

About Baltazar Ruiz - Over 15 years of experience as Sr. Network Engineer, currently works as an Application Engineer in Intel.

Out of the Box Network Developers, Bangalore

SDN/NFV industry standard DevOps implementation, Benchmarking and Testing
Tuesday, March 7, 2017, 10:00 AM to 3:30 PM

Increasingly, service providers are creating labs to try out their own SDN/NFV solutions, and the Indian telecom industry is playing a critical role in this effort. In this meetup, key players in this market, such as Intel, TCS, and Tech Mahindra, will talk about the roles they are playing.

Out of the Box Network Developers, Ireland

Enabling Virtualization in SDN and NFV world on IA
Wednesday, March 29, 2017, 9:30 AM

Irish Aviation Authority Conference Centre
The Times Building, 11-12 D'Olier Street, Dublin 2

Networking has traditionally been entrusted to customized ASICs and vertically integrated software solutions on top of these customized boxes. Virtualizing network functions on commodity x86-based servers brings its own challenges, among them sharing resources among virtual functions and the added software layers of hypervisor, soft switches, and so on, before a packet reaches the network function. This meetup will give attendees a flavor of some of the tools and technologies being used by the industry and Intel to resolve these issues. It will be presented by Andrew Duignan, an Electronic Engineering graduate from University College Dublin, Ireland. He has worked as a software engineer at Motorola and now at Intel Corporation, where he is in a Platform Applications Engineering role supporting technologies such as DPDK and virtualization on Intel CPUs. He is based at the Intel Shannon site in Ireland.

Please register to confirm a spot and a chance to win a 50 euro Amazon gift card.

New on Intel Developer Zone

Analyzing Open vSwitch* with DPDK Bottlenecks Using Intel® VTune™ Amplifier

This article shows how we used Intel® VTune™ Amplifier to identify and fix an MMIO transaction performance bottleneck at the microarchitecture level in OVS-DPDK. By Bhanu Prakash Bodi Reddy and Antonio Fischetti.

Build Your Own Traffic Generator – DPDK-in-a-Box

Build your own DPDK-based traffic generator with a MinnowBoard Turbot or any Intel platform. OS is Ubuntu* 16.04 client with DPDK. Uses the TRex* realistic traffic generator. By M Jay (Muthurajan Jayakumar).

SR-IOV and OVS-DPDK Hands-on Labs

Automate setup of SR-IOV with DPDK in Linux* and configure an OVS-DPDK NFV use case inside nested VMs. Provision into a cluster or datacenter. Includes all the setup scripts used by Clayne to prepare for the hands-on labs at the December 8 Intel® Developer Zone NFV/DPDK Devlab. By Clayne Robison.

Visit our library on Intel Developer Zone to see all of our Networking articles and videos.

Recap of Our January 2017 Meetups

The SDN/NFV Developer Party on January 19th at SC-9 (Santa Clara)

Sixty-three developers from 40 companies were trained. The party put a spotlight on project pitches from the Out Of The Box meetup community, ending with a project-idea brainstorming session.

There were lightning talks from industry developers from Intel, CableLabs, Apple, Huawei, and ONOS on topics such as RDT in NFV, OPEN-O*, ONOS*, and more.

Three developers from the Out Of the Box community gave prepared talks on ongoing projects stemming from the NFV/DPDK DevLab in December 2016, and one new developer from the audience came up and talked about their project. These developers were given DPDK-in-a-Box dev kits in recognition of their work.

Enabling Virtualization in SDN and NFV world on IA (Bangalore)

One hundred fifteen developers were trained. This first meetup in Bangalore was planned and executed to engage developers from networking companies, CSPs, networking and telecom equipment manufacturers, system integrators, and OEMs that are focused on adopting NFV and SDN technologies.

This meetup was remarkably successful, with an audience of 115 developers, architects, and network engineers from 64 different companies, exceeding the original expectation of 50 developers.

People and Projects

Congratulations to our Two New Intel Software Innovators

  • Anthony Chow (SDN/NFV developer and blogger with 6,000+ followers)
  • Shivaram Mysore (Developer for Faucet, which is an open-source SDN controller)

Latest Developer Projects

Intel® Network Builders University

Data Plane Development Kit Courses

The Intel® Network Builders University released a series of new courses this month in the Data Plane Development Kit (DPDK) program. The new material adds a variety of deep-dive technical content to the existing library. Expand your DPDK knowledge and skills by taking the following new courses:

Setting Up DPDK on Different Operating Systems: In this course, you will learn the process of installing DPDK on a variety of operating systems, the installation of DPDK from source code, and the topic of setting up Hugepages.

The DPDK Packet Framework: This course covers the DPDK packet framework and provides example pipelines and use cases.

The DPDK Sample Applications: This course provides a basic introduction to a few of the 40-plus DPDK sample applications available today.

Writing a Simple DPDK Forwarding Application: In this course, Intel Software Engineer, Ferruh Yigit, covers the topic of writing a simple DPDK forwarding application.

In addition to the new deep dive DPDK content, the Intel Network Builders University is releasing a new course on containers in the Management and Orchestration program titled "Container Orchestration with Kubernetes."* To access the material, simply log in or register on the Intel Network Builders University page.

Optimizations Enhance Halo Wars* 2 For PCs with Intel Integrated Graphics

Download Document PDF 1.35MB

The Mission

When top UK-based studio Creative Assembly* began their ambitious work on Halo Wars* 2, they wanted the game to run on a variety of settings supported by DirectX* 12, and to be playable up and down the hardware ladder—including advanced desktop PC configurations and laptops. While many of the optimizations also enhanced the game for high-end systems with discrete graphics cards, this white paper will explore the team’s efforts for Intel integrated graphics and multicore processing functions.

Extending a Franchise

First-person shooter Halo* is one of the most popular franchises in PC gaming history; it started in 2001 with Halo: Combat Evolved, one of the original Xbox* launch titles. By late 2015, Halo had generated over USD 5 billion in lifetime game and hardware sales. Development work for Halo Wars 2 was performed by Creative Assembly, veterans of Alien: Isolation* and Total War: Warhammer*. They were experienced at producing games for multiple platforms, but with DirectX 12 still maturing, the team faced challenges with their engine, the DX 12 driver, multicore efficiency, and more.


Figure 1. Halo Wars* 2 is the latest addition to the Halo* universe.

Switching the game to an RTS title meant tracking more units and packing the interface with crucial statistics, while updating the “mini-map” constantly. Halo Wars 2 also introduces “Blitz” mode—a new, innovative and action-packed twist on RTS gameplay that combines card-based strategy with explosive combat. The new mode also streamlines most of the traditional RTS systems, such as base-building, skills development, and resource management.

Expanding the User Base

Michael Bailey, Lead Engine Programmer at Creative Assembly, was one of the principal developers involved.

“Given the nature of RTS games, unit counts and large-scale battles with plenty of VFX were always a big focus,” he explained. “This, combined with a deterministic networking model common to RTS games, put restrictions on how scalable we could be, so we pushed from the start to ensure we had a good base to make use of multiple CPU cores. Our primary goal here was to ensure the game ran on a wide range of hardware, and the Microsoft Surface* Pro 4 laptop was a natural to target, aiming for 30 fps and still looking great.”


Figure 2. Dazzling particle effects keep the screen display busy.

Excellence across a wide spectrum of hardware was key. There is a temptation to develop early versions of a game to initially fit the specs for a high-end desktop system, taking full advantage of 10 teraflops of GPU potential. But this makes later scaling the game to Xbox One* and Ultrabook™ devices tricky, as early design choices could limit optimization options. With top-end Ultrabooks and laptops nearing the Xbox One in power and performance, those devices are now considered mainstream. That means developers can no longer concentrate on either CPU or GPU optimizations—both are crucial.


Figure 3. Getting the game to play on Ultrabook™ devices and laptops was a key task.

Bailey had not specifically optimized previous games for Intel® HD Graphics, but working directly with Intel allowed him to get up to speed quickly. Fortunately, the graphics systems used by Creative Assembly were designed to be scalable without hurting the overall look and feel. Intel engineers talked Bailey’s team through their tools, on site, to diagnose issues and measure performance for added features. As the DX 12 version became more mature, areas for improvement were pinpointed. This included testing the performance of the game across multicore machines to ensure acceptable scaling.

The Game is On

In the early stages of the project, the team encountered several issues:

  • Instability on the target system
  • Speed issues at low settings on a discrete GPU
  • Severe I/O lag
  • Tools failure on the Universal Windows Platform (UWP) and DirectX 12
  • Driver issues
  • Corruption due to buffers being reused before completion

None of the problems turned out to be show-stoppers, but the early screenshots revealed that there was a lot of work to do.


Figure 4. Early screenshots revealed corruption on a wide scale, affecting terrain, units, and the “mini-map.”

Multicore Optimizations

The team worked on several different areas to optimize the game’s multicore support:
  • Improve algorithms
  • Reduce memory allocations
  • Streamline assets
  • Perform low-level optimization
  • Enhance parallelization

In multiplayer battles, there were challenges ensuring that the simulation—spread over multiple threads—remained deterministic. On the CPU side, the team concentrated on being able to split the simulation while ensuring determinism. In addition, specific to DirectX 12, they worked to minimize resource barriers and redundant work.

The team found that changing the order of operations created a chance that the clients would diverge, each ending up with a different representation of the Halo Wars 2 world. If two or more clients disagreed on a checksum, for example, the result was a “desync” that caused a player to be kicked out. One cause was a race condition, where the output depends on the sequence or timing of other uncontrollable events. The team soon learned which calculations could be put onto other threads, when they could run, and when it was safe to do so.
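
The general technique, sketched below in a minimal, engine-agnostic C++ example (the DamageResult type and its placeholder math are hypothetical, not Creative Assembly’s code), is to let worker threads finish in any order but apply their results in a fixed, ID-based order so every client mutates its copy of the world identically:

    #include <algorithm>
    #include <cstdint>
    #include <mutex>
    #include <thread>
    #include <vector>

    // Hypothetical per-unit result produced on a worker thread.
    struct DamageResult {
        uint32_t unit_id;
        int32_t  damage;
    };

    int main() {
        const std::vector<uint32_t> unit_ids = {42, 7, 19, 3, 88};

        std::vector<DamageResult> results;   // filled in nondeterministic order
        std::mutex results_mutex;

        // Fan the per-unit work out across threads. The math itself must also
        // be deterministic (no wall-clock time, no unseeded rand()).
        std::vector<std::thread> workers;
        for (uint32_t id : unit_ids) {
            workers.emplace_back([id, &results, &results_mutex] {
                DamageResult r{id, static_cast<int32_t>(id % 7)};  // placeholder math
                std::lock_guard<std::mutex> lock(results_mutex);
                results.push_back(r);        // arrival order depends on scheduling
            });
        }
        for (auto& w : workers) w.join();

        // Apply results in a fixed, ID-based order so every client mutates its
        // copy of the world identically, no matter how the threads were timed.
        std::sort(results.begin(), results.end(),
                  [](const DamageResult& a, const DamageResult& b) {
                      return a.unit_id < b.unit_id;
                  });
        // for (const auto& r : results) ApplyToSimulation(r);  // apply step elided
        return 0;
    }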

They eventually reached a stage where the CPU side was running efficiently across multiple threads, quite apart from the render thread. Their work on multicore optimization will be presented at the 2017 Game Developers Conference (see the link in the Additional Resources list below).

GPU Optimizations and Diagnostics

The Creative Assembly team performed multiple GPU optimizations, working closely with Intel® Graphics Performance Analyzers (Intel® GPA). These are powerful, agile tools which enable game developers to utilize the full performance potential of their gaming platform, including Intel® Core™ processors and Intel® HD Graphics, as well as Intel® architecture-based tablets running the Android* operating system. Intel Graphics Performance Analyzers visualize performance data from the application, enabling developers to understand system-level and individual frame performance issues, as well as allowing “what-if” experiments to estimate potential performance gains from optimizations.

The Creative Assembly team used new features of Intel GPA Monitor to launch and profile the game in action. They also examined problematic frames with Intel GPA Frame Analyzer to identify graphics “hot spots” in a particular scene.

Intel GPA tools also assisted investigations into corrupt terrain tiles, which the team tracked down to an error with descriptor tables. In addition, they investigated performance bottlenecks and judged performance versus quality tradeoffs for code that controlled terrain tessellation, as well as texture resolution bandwidth bottlenecks.

Bailey and the team also used Microsoft GPUView and other internal game profilers. The team scaled down their graphical options in the following areas:

  • Disabling Async Compute when it ended up using too many synchronization primitives to guard between multiple command lists. (This slowed things down on hardware where Async Compute isn’t natively supported.)
  • Reducing the terrain tessellation amounts for quality versus performance scalability reasons; on smaller screens, the extra detail afforded by tessellation wasn’t worth the cost.
  • Dropping the terrain compositing tiled resource map sizes down to match the resolution the game was running in. This helped to avoid compositing terrain textures at resolutions that were too high for the display.
  • Correcting the pixel shader system. The team had assumed that as the pixel shader wasn’t outputting anything within their shadow-rendering Pipeline State Objects (PSOs), it would be stripped along with interpolators from the Vertex shader. Intel GPA revealed that wasn’t the case.
  • Tweaking the small-object culling factor to account for resolution (both in the main scene and in the shadows where the shadow resolution was much smaller).
  • Swapping the heavy shadow PCF filter kernel to use GatherCmp to massively reduce the cost of that shader with no visual difference.
  • Removing redundant terrain composition layers (a bug was causing them to export all layers, even unnecessary ones).
  • Implementing a dynamic particle-reduction system which prioritized battle visual effects and dropped less important “incidental” environmental effects, or heavier “high-end” effects such as particle lights.
Players identify units by the visual effects they see and by how the units look against the terrain, so the optimizations Bailey and his team implemented had to keep the display readable without stripping the frame bare or disabling essential effects. “If the scene gets too heavy, we start ramping down the particle effects,” he said.

They also ensured that the battle effects look amazing: when there were 20 explosions on top of each other, they started turning off environmental effects and the less important visual effects. “Those are based upon the current particle system load,” he explained. “On the Microsoft Surface Pro 4, they’re probably dropped down a bit more than they would drop down on a higher-end system which can handle more of a load. You get the environmental effects staying on, but they’re not really a big part of the battle.”
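
A load-based ramp of that kind can be sketched in a few lines of C++; the priority tiers, thresholds, and type names below are illustrative assumptions rather than the game’s actual values:

    #include <cstdint>
    #include <vector>

    // Illustrative priority tiers; the real game's categories may differ.
    enum class FxPriority : uint8_t {
        Battle,         // weapon impacts, explosions: always keep
        Environmental,  // ambient dust and smoke: drop under load
        HighEnd         // particle lights and other expensive extras: drop first
    };

    struct ParticleSystem {
        FxPriority priority;
        bool       active;
    };

    // Ramp effects down as the particle load (0.0 to 1.0) rises. The thresholds
    // are assumptions; a device like the Surface Pro 4 simply reaches them
    // sooner because its load climbs faster.
    void RampParticles(std::vector<ParticleSystem>& systems, float load) {
        for (auto& fx : systems) {
            switch (fx.priority) {
                case FxPriority::HighEnd:       fx.active = load < 0.6f; break;
                case FxPriority::Environmental: fx.active = load < 0.8f; break;
                case FxPriority::Battle:        fx.active = true;        break;
            }
        }
    }

    int main() {
        std::vector<ParticleSystem> fx = {{FxPriority::Battle, true},
                                          {FxPriority::Environmental, true},
                                          {FxPriority::HighEnd, true}};
        RampParticles(fx, 0.7f);  // heavy battle: high-end extras get dropped
        return 0;
    }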


Figure 5. Big battles contain multiple units, plenty of particle effects, and lots of information to process.

DirectX 12 Challenges

Creative Assembly’s technical base is separate from the Total War: Warhammer engine; its graphics layer was brought over from Alien: Isolation. They wrote the DX 12 graphics layer on top of the existing DX 11 support, both to take advantage of newer features and to make it more efficient than the DX 11 path.

When they started, the graphics layer was targeting DX 11 for PC, so planning for and moving to DX 12 exclusively gave them plenty of opportunities to maximize performance. Bailey discovered during development that they were “over-cautious” with fences and descriptor table invalidations to ensure correctness and stability. They studied several DX 12 talks at GDC 2016 and gathered more information from online sources, until they understood the best way to take advantage of it in their engine.

As the system became more mature and stable, they started to optimize to ensure they removed redundant descriptor settings and unnecessary resource barriers and fences. This, combined with render pass changes to track the resource dependencies better, meant that they were able to shift some of their command-list building passes, such as shadows, to run in parallel with other areas and make more use of multiple cores. The command lists they moved to run in parallel were heavy on the CPU side due to the number of draw calls. These passes increased the cost with the number of units fighting on-screen, so they benefitted greatly from this optimization during heavy battle sequences. Their DX 12 version soon became faster than DX 11 for CPU-side rendering, even single-threaded.
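
In D3D12 terms, moving a pass such as shadows onto another core comes down to giving each recording thread its own command allocator and command list, then submitting everything from one thread in a fixed order. The sketch below shows only that general pattern; the empty RecordPass body stands in for the real per-pass work, and none of this is Creative Assembly’s code:

    // General D3D12 pattern only: one command allocator and command list per
    // recording thread, one fixed-order submit. Error handling omitted; link
    // against d3d12.lib.
    #include <windows.h>
    #include <d3d12.h>
    #include <wrl/client.h>
    #include <thread>

    using Microsoft::WRL::ComPtr;

    // Stand-in for real per-pass recording (shadow draws, main-scene draws, ...).
    static void RecordPass(ID3D12GraphicsCommandList* list)
    {
        // ... SetPipelineState / resource barriers / draw calls would go here ...
        list->Close();  // list is now ready for ExecuteCommandLists
    }

    int main()
    {
        ComPtr<ID3D12Device> device;
        D3D12CreateDevice(nullptr, D3D_FEATURE_LEVEL_11_0, IID_PPV_ARGS(&device));

        D3D12_COMMAND_QUEUE_DESC qdesc = {};
        qdesc.Type = D3D12_COMMAND_LIST_TYPE_DIRECT;
        ComPtr<ID3D12CommandQueue> queue;
        device->CreateCommandQueue(&qdesc, IID_PPV_ARGS(&queue));

        // One allocator + list per thread; allocators must not be shared
        // while recording.
        ComPtr<ID3D12CommandAllocator>    alloc[2];
        ComPtr<ID3D12GraphicsCommandList> list[2];
        for (int i = 0; i < 2; ++i) {
            device->CreateCommandAllocator(D3D12_COMMAND_LIST_TYPE_DIRECT,
                                           IID_PPV_ARGS(&alloc[i]));
            device->CreateCommandList(0, D3D12_COMMAND_LIST_TYPE_DIRECT,
                                      alloc[i].Get(), nullptr,
                                      IID_PPV_ARGS(&list[i]));
        }

        // Record the shadow pass and the main pass concurrently.
        std::thread shadowThread(RecordPass, list[0].Get());
        RecordPass(list[1].Get());
        shadowThread.join();

        // Submit both lists in a fixed order once recording is finished.
        ID3D12CommandList* lists[] = { list[0].Get(), list[1].Get() };
        queue->ExecuteCommandLists(2, lists);
        return 0;
    }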


Figure 6. In this screenshot, the "mini-map" in the bottom right corner is fully functional.

The trade-off was that the DX 12 environment was very new: the validation layers that provide diagnostic warnings and errors were not yet mature, and the documentation was incomplete or inaccurate. When things went wrong, Bailey’s team wasn’t sure if it was their fault or if the driver was to blame.

Eventually they were developing only on DX 12, and advancing to a single low-level graphics API gave them more control over the performance trade-offs they could make. This allowed the team to target a large number of Windows* 10 devices with a single code-path. The beta released in January 2017 supports DirectX 12 down to feature level 11.0; the game’s terrain texture compositing code requires “Tiled Resources Tier 1” support.

Challenges of Evolving Software

Because there were so many new features they wanted to exploit in the game, the team had to continually update drivers, compilers, and Windows 10 versions. Fortunately, they were in direct contact with various hardware vendors, which meant they had access to beta drivers with specific fixes, as well as beta versions of their internal tools.

Over time, the drivers, tools, and validation layers became more mature, so when errors did occur they were usually from the game code and were easier to track down. The graphics testbed helped debug a wide set of DX 12 API functionality, from developing new features to working in a much more cut-down environment where they were in control of all the variables.


Figure 7. Cut-down environment for testing new features.

Advice for Developers

Reach Out to the Experts

Bailey advises developers to reach out to the Independent Hardware Vendors (IHVs) for key insights on problems. “They obviously want your game working well on their hardware, and they will give you help, as well as tools,” he said. “If you can get them to take an interest in your game, or you’re taking interest in their hardware, then it’s a mutually beneficial experience.” Part of the payoff is access to early fixes and improvements, and optimization feedback, which Bailey’s team used extensively.

“I don’t think we ever got conflicting feedback,” Bailey said. The vendors were helpful in identifying known issues, and, in some places, the game works slightly differently when it identifies hardware from Intel, AMD*, or NVIDIA*. “They've all got their own slight differences, but talking to them is how you find out what they’re good at,” Bailey said. Overall, the result was a faster game, especially on the lower-end hardware.

Create a Robust Test Lab

The Creative Assembly engine team has six people attacking problems and testing solutions in a variety of ways. They often borrow machines as needed and rely heavily on remote debugging to a second machine or Surface Pro 4 at their desk—this makes working on a touchscreen device much simpler. “I’m not sitting there trying to use the touchscreen to try and get it to fit on the screen,” Bailey said. “I can comfortably use my big desktop to remote debug.”

Microsoft owns the rights to the Halo franchise, so Creative Assembly had access to Microsoft's compact testing lab through that relationship. It was a gateway to all sorts of different varieties of hardware, some below the minimum specs. Being able to regularly test builds increased their coverage and ensured that Creative Assembly’s optimizations were thoroughly checked.

Share Knowledge

Bailey encourages developers to reach out and participate within the community, especially by attending events such as the Game Developers Conference (GDC). If that’s not possible, Bailey suggests “downloading the talks to catch up on them later.”

Previously, there were only a few DirectX 12 talks. This year, there will be more tips and tricks shared by experienced teams. “You want to share your successes as well,” Bailey said. “If you share your success, then people will come and talk to you about how they did something differently.”

It’s like a multi-threaded attack on a problem, but with people instead of processors. “When you share these things, you invite the conversation,” Bailey explained. “And inviting that conversation means you might find out a better way of doing something. Maybe your approach was pretty good already, but things can always get better and better.”

Additional Resources

Intel Graphics Performance Analyzers:
https://software.intel.com/en-us/gpa

Intel Multicore FAQ:
https://software.intel.com/en-us/articles/frequently-asked-questions-intel-multi-core-processor-architecture

Introduction to the Universal Windows Platform:
https://docs.microsoft.com/en-us/windows/uwp/get-started/universal-application-platform-guide

DirectX 12 Installer:
https://www.microsoft.com/en-us/download/details.aspx?id=35

Microsoft Surface Pro 4:
https://www.microsoftstore.com/store/msusa/en_US/pdp/productID.5072641000

CPU and GPU Optimization Talk at GDC 2017:
http://schedule.gdconf.com/session/threading-your-way-through-the-tricks-and-traps-of-making-a-dx12-sequel-to-halo-wars-presented-by-intel

IDZ Production Workflow - New Project



 

Project Requested

STAKEHOLDER submits a ticket in YouTrack or requests a new space on IDZ.

PROJECT MANAGER creates a kickoff meeting.

 

Kickoff Meeting & Master Tickets Created

Kickoff meeting with STAKEHOLDER, EDITORIAL, and UX to define the scope of the project and the schedule.

If scope is not finalized, a discovery meeting is set to define scope. If scope is finalized, the following artifacts are created:

  • Master Ticket created with the following structure (please duplicate this EXAMPLE MASTER TICKET):
    • Overview Scope
      • New/migration/content update?
      • No. of pages (approx)
    • Stakeholders
    • BC Team
      • PM assigned
    • Site Content
      • Sitemap
      • GatherContent
      • Wireframe
      • Design Comps
      • Preview URL section
    • Schedule
      • Initial schedule is created and validated offline
      • Training meeting setup for tools (if needed)
        • GatherContent
        • Youtrack
        • Author Training
        • Book Training

 

Sitemap Created

UX creates sitemap (this sometimes happens in kickoff, but may need to be post meeting) and adds links to master ticket.

Example:

[insert image of sitemap]

 

GatherContent Created

EDITORIAL creates GatherContent project based on UX sitemap, adds links to copy ticket and assigns ticket to PM.

If any training is required, EDITORIAL arranges a GatherContent training session.

 

Content Outline Created

STAKEHOLDER adds a content outline to GatherContent. This will help inform the wireframes and copy guidance.

STAKEHOLDERS provide all visual assets for DESIGN team to reference.

Example: [link to a GC space with an example?]

 

Wireframes Created

UX creates wireframes based on the copy outline and adds links to design ticket. 

 

Copy & Initial Design Comps Created

STAKEHOLDER adds copy to GatherContent and obtains all team and legal reviews or approvals. This copy should be created according to design layout, trademark and branding, and style guidelines.

Copy should be at 90% complete.

PROJECT MANAGER watches status of GC and alerts EDITORIAL when copy is ready for editor review.

 

AT THE SAME TIME...

DESIGN creates comps based on the wireframes and content outline. If there are any issues, DESIGN waits until the initial copy has been added to GC (for problematic pages). DESIGN updates the design ticket.

DESIGN presents comps to STAKEHOLDER for validation.

 

Copy Review

EDITORIAL updates workflow status in GatherContent through all stages of editing, and then changes status to “Approved” when complete.

EDITORIAL informs DESIGN of any issues that may be present.

EDITORIAL updates copy ticket with appropriate information and assigns to PROJECT MANAGER.

During the time EDITORIAL is reviewing final content, DESIGN should verify that no changes are needed to the design comps.

If excessive changes are required, DESIGN lets the PROJECT MANAGER know.

If design comps are good to go, DESIGN preps assets for upload and assigns design ticket to PROJECT MANAGER.

 

Web Production Tickets Created

PROJECT MANAGER creates webops tickets at this point. This ensures that we don't have a backlog of tickets waiting while content is being finalized.
PROJECT MANAGER informs DESIGN and EDITORIAL and requests they add assets and GC links to appropriate tickets.

PROJECT MANAGER assigns tickets to WEBOPS.

 

Web Pages Built

WEBOPS builds pages.

WEBOPS adds links to design ticket and sends pages to UX and EDITORIAL for a QA review before stakeholder review.

 

Web Page Reviews

EDITORIAL and DESIGN review layout and content to ensure everything matches comps and GC.

Once complete, DESIGN and EDITORIAL inform the PROJECT MANAGER.

The PROJECT MANAGER sends links to STAKEHOLDER for review and updates master ticket with details.

STAKEHOLDER reviews the links in the design ticket and provides feedback (changes are limited to legal or factual errors) in the following tools:

  • GatherContent for copy changes
  • YouTrack Master Ticket for all other changes

 

Final Updates

UX, EDITORIAL, DESIGN, and WEBOPS update per feedback and mark as "ready to launch" in appropriate webops ticket.

 

Go/No-Go and Launching

PROJECT MANAGER schedules a "Go/No-Go" meeting with all team members, including STAKEHOLDER.

Any changes during this meeting should be focused on the following questions:

  • Will we get sued if this goes live?
  • Will the STAKEHOLDER get fired if this goes live?
  • Is some fact just plain wrong?

 


WEBOPS makes as many changes as they can during this meeting.

Sometimes the Go/No-Go meeting becomes the launch call if there are no issues with the content and the pages have no embargo date.

The PROJECT MANAGER schedules a launch call (if one isn't already scheduled).

THE LAUNCH CALL
All team members attend and review content as it is pushed live.

Team will check:

  • If links work and are correct
  • If content is correct
  • If last minute changes are complete
  • If all the redirects are set up

 

PROJECT MANAGER holds all webops tickets for localization while the site stabilizes (typically 2 weeks unless otherwise specified).

 

 

 
