Search icon CANCEL
Subscription
0
Cart icon
Your Cart (0 item)
Close icon
You have no products in your basket yet
Save more on your purchases now! discount-offer-chevron-icon
Savings automatically calculated. No voucher code required.
Arrow left icon
Explore Products
Best Sellers
New Releases
Books
Videos
Audiobooks
Learning Hub
Conferences
Free Learning
Arrow right icon
Arrow up icon
GO TO TOP
Unity 5  Game Optimization

You're reading from   Unity 5 Game Optimization Master performance optimization for Unity3D applications with tips and techniques that cover every aspect of the Unity3D Engine

Arrow left icon
Product type Paperback
Published in Nov 2015
Publisher Packt
ISBN-13 9781785884580
Length 296 pages
Edition 1st Edition
Languages
Tools
Arrow right icon
Author (1):
Arrow left icon
Chris Dickinson Chris Dickinson
Author Profile Icon Chris Dickinson
Chris Dickinson
Arrow right icon
View More author details
Toc

Table of Contents (10) Chapters Close

Targeted profiling of code segments

If our performance problem isn't solved by the above checklist, then we probably have a real issue on our hands that demands further analysis. The task of figuring out exactly where the problem is located still remains. The Profiler window is effective at showing us a broad overview of performance; it can help us find specific frames to investigate and can quickly inform us which MonoBehaviour and/or method may be causing issues. However, we must still determine exactly where the problem exists. We need to figure out if the problem is reproducible, under what circumstances a performance bottleneck arises, and where exactly within the problematic code block the issue is originating from.

To accomplish these, we will need to perform some profiling of targeted sections of our code, and there are a handful of useful techniques we can employ for this task. For Unity projects, they essentially fit into two categories:

  • Controlling the Profiler from script code
  • Custom timing and logging methods

Note

Note that the following section mostly focusses on how to investigate Scripting bottlenecks through C# code. Detecting the source of bottlenecks in other engine components will be discussed in their related chapters.

Profiler script control

The Profiler can be controlled in script code through the static Profiler class. There are several useful methods in the Profiler class that we can explore within the Unity documentation, but the most important methods are the delimiter methods that activate and deactivate profiling at runtime: Profiler.BeginSample() and Profiler.EndSample().

Tip

Note that the delimiter methods, BeginSample() and EndSample(), are only compiled in development builds, and as such they cause no overhead in the final build. Therefore, it is safe to leave them in our codebase if we wish to use them for profiling tests at a later date.

The BeginSample() method has an overload that allows a custom name for the sample to appear in the CPU Usage Area's Hierarchy Mode view. For example, the following code will profile invocations of this method and make the data appear in the Breakdown View under a custom heading:

void DoSomethingCompletelyStupid() {
  Profiler.BeginSample("My Profiler Sample");

  List<int> listOfInts = new List<int>();
  for(int i = 0; i < 1000000; ++i) {
    listOfInts.Add(i);
  }

  Profiler.EndSample();
}

Tip

Downloading the example code

You can download the example code files from your account at http://www.packtpub.com for all the Packt Publishing books you have purchased. If you purchased this book elsewhere, you can visit http://www.packtpub.com/support and register to have the files e-mailed directly to you.

We should expect that invoking this poorly designed method (it generates a list containing a million integers, and then does absolutely nothing with it) will cause a huge spike in CPU usage, chew up several Megabytes of memory, and appear in the Profiler Breakdown View under the heading My Profiler Sample as the following screenshot shows:

Profiler script control

Note that these custom sample names do not appear at the root of the hierarchy when we perform Deep Profiling. The following screenshot shows the Breakdown View for the same code under Deep Profiling:

Profiler script control

Note how the custom name for the sample does not appear at the top of the sample, where we may expect it to. It's unclear what causes this phenomenon, but this can cause some confusion when examining the Deep Profiling data within Hierarchy Mode, so it is good to be aware of it.

Custom CPU Profiling

The Profiler is just one tool at our disposal. Sometimes, we may want to perform customized profiling and logging of our code. Maybe we're not confident the Unity Profiler is giving us the right answer, maybe we consider its overhead cost too great, or maybe we just like having complete control of every single aspect of our application. Whatever our motivations, knowing some techniques to perform an independent analysis of our code is a useful skill to have. It's unlikely we'll only be working with Unity for the entirety of our game development careers, after all.

Profiling tools are very complex, so it's unlikely we would be able to generate a comparable solution on our own within a reasonable time frame. When it comes to testing CPU usage, all we need is an accurate timing system, a fast, low-cost way of logging that information, and some piece of code to test against. It just so happens that the .NET library (or, technically, the Mono framework) comes with a Stopwatch class under the System.Diagnostics namespace. We can stop and start a Stopwatch object at any time, and we can easily acquire a measure of how much time has passed since the Stopwatch was started.

Unfortunately, this class is not very accurate—it is accurate only to milliseconds, or tenths of a millisecond at best. Counting high-precision real time with a CPU clock can be a surprisingly difficult task when we start to get into it; so, in order to avoid a detailed discussion of the topic, we should try to find a way for the Stopwatch class to satisfy our needs.

Before we get obsessed with the topic of high precision, we should first ask ourselves if we even need it. Most games expect to run at 30 FPS (frames-per-second) or 60 FPS, which means they only have around 33 ms or 16 ms, respectively, to compute everything for the entire frame. So, hypothetically, if we only need to bring the performance of a particular code block under 10ms, then repeating the test thousands of times to get microsecond precision wouldn't really tell us anything useful.

However, if precision is important, then one effective way to increase it is by running the same test multiple times. Assuming that the test code block is both easily repeatable and not exceptionally long, then we should be able to run thousands, or even millions, of tests within a reasonable timeframe and then divide the total elapsed time by the number of tests we just performed to get a more accurate time for a single test.

The following is a class definition for a custom timer that uses a Stopwatch to count time for a given number of tests:

using UnityEngine;
using System;
using System.Diagnostics;
using System.Collections;

public class CustomTimer : IDisposable {
  private string m_timerName;
  private int m_numTests;
  private Stopwatch m_watch;

  // give the timer a name, and a count of the number of tests we're running
  public CustomTimer(string timerName, int numTests) {
    m_timerName = timerName;
    m_numTests = numTests;
    if (m_numTests <= 0)
      m_numTests = 1;
    m_watch = Stopwatch.StartNew();
  }

  // called when the 'using' block ends
  public void Dispose() {
    m_watch.Stop();
    float ms = m_watch.ElapsedMilliseconds;
    UnityEngine.Debug.Log(string.Format("{0} finished: {1:0.00}ms total, {2:0.000000}ms per test for {3} tests", m_timerName, ms, ms / m_numTests, m_numTests));
  }
}

The following is an example of the CustomTimer class usage:

int numTests = 1000;

using (new CustomTimer("My Test", numTests)) {
  for(int i = 0; i < numTests; ++i) {
    TestFunction();
  }
} // the timer's Dispose() method is automatically called here

There are three things to note when using this approach. Firstly, we are only making an average of multiple method invocations. If processing time varies enormously between invocations, then that will not be well-represented in the final average. Secondly, if memory access is common, then repeatedly requesting the same blocks of memory will result in an artificially higher cache hit rate, which will bring the average time down when compared to a typical invocation. Thirdly, the effects of JIT compilation will be effectively hidden for similarly artificial reasons as it only affects the first invocation of the method. JIT compilation is something that will be covered in more detail in Chapter 7, Masterful Memory Management.

The using block is typically used to safely ensure that unmanaged resources are properly destroyed when they go out of scope. When the using block ends, it will automatically invoke the object's Dispose() method to handle any cleanup operations. In order to achieve this, the object must implement the IDisposable interface, which forces it to define the Dispose() method.

However, the same language feature can be used to create a distinct code block, which creates a short-term object, which then automatically processes something useful when the code block ends.

Note

Note that the using block should not be confused with the using statement, which is used at the start of a script file to pull in additional namespaces. It's extremely ironic that the keyword for managing namespaces in C# has a naming conflict with another keyword.

As a result, the using block and the CustomTimer class give us a clean way of wrapping our target test code in a way which makes it obvious when and where it is being used.

Another concern to worry about is application warm up. Unity has a significant startup cost when a Scene begins, given the number of calls to various GameObjects' Awake() and Start() methods, as well as initialization of other components such as the Physics and Rendering systems. This early overhead might only last a second, but that can have a significant effect on the results of our testing. This makes it crucial that any runtime testing begins after the application has reached a steady state.

If possible, it would be wise to wrap the target code block in an Input.GetKeyDown() method check in order to have control over when it is invoked. For example, the following code will only execute our test method when the Space Bar is pressed:

if (Input.GetKeyDown(KeyCode.Space)) {
  int numTests = 1000;

  using (new CustomTimer("Controlled Test", numTests)) {
    for(int i = 0; i < numTests; ++i) {
      TestFunction();
    }
  }
}

There are three important design features of the CustomTimer class: it only prints a single log message for the entire test, only reads the value from the Stopwatch after it has been stopped, and uses string.Format() for generating a custom string.

As explained earlier, Unity's console logging mechanism is prohibitively expensive. As a result, we should never use these logging methods in the middle of a profiling test (or even gameplay, for that matter). If we find ourselves absolutely needing detailed profiling data that prints out lots of individual messages (such as performing a timing test on each iteration of a loop, to find which iteration is costing more time than the rest), then it would be wiser to cache the logging data and print it all at the end, as the CustomTimer class does. This will reduce runtime overhead, at the cost of some memory consumption. The alternative is that many milliseconds are lost to printing each Debug.Log() message in the middle of the test, which pollutes the results.

The second feature is that the Stopwatch is stopped before the value is read. This is fairly obvious; reading the value while it is still counting might give us a slightly different value than stopping it first. Unless we dive deep into the Mono project source code (and the specific version Unity uses), we might not know the exact implementation of how Stopwatch counts time, at what points CPU ticks are counted, and at what moments any application context switching is triggered by the OS. So, it is often better to err on the side of caution and prevent any more counting before we attempt to access the value.

Finally, there's the usage of string.Format(). This will be covered in more detail in Chapter 7, Masterful Memory Management, but the short explanation is that this method is used because generating custom strings using the +operator results in a surprisingly large amount of memory consumption, which attracts the attention of the garbage collector. This would conflict with our goal of achieving accurate timing and analysis.

You have been reading a chapter from
Unity 5 Game Optimization
Published in: Nov 2015
Publisher: Packt
ISBN-13: 9781785884580
Register for a free Packt account to unlock a world of extra content!
A free Packt account unlocks extra newsletters, articles, discounted offers, and much more. Start advancing your knowledge today.
Unlock this book and the full library FREE for 7 days
Get unlimited access to 7000+ expert-authored eBooks and videos courses covering every tech area you can think of
Renews at $19.99/month. Cancel anytime