
Introduction to pyDAAL

This paper shows how the Python* API of the Intel® Data Analytics Acceleration Library (Intel® DAAL) works. First, we explain how to manipulate data using the pyDAAL programming interface and then show how to integrate it with Python data manipulation/math APIs. Finally, we demonstrate how to use pyDAAL to implement a simple Linear Regression solution for a prediction problem.

Data Science is a relatively new field that brings together concepts from many other areas, such as data mining, data analysis, data modeling, data prediction, and data visualization. The need to perform such tasks as quickly as possible has become the main issue in today's data solutions. With that in mind, Intel DAAL is a highly optimized library whose goal is to provide a full solution for data analytics targeting today's highly parallel systems, such as Intel® Xeon Phi™ processors.


Intel DAAL delivers solutions for many steps of a data analytics pipeline, such as pre-processing, data transformations, dimensionality reduction, data modeling, and prediction, as well as drivers for reading and writing the most common data formats. A summary of all features inside the library can be seen in Figure 1.


Figure 1. Main algorithms delivered by Intel® Data Analytics Acceleration Library

As can be seen in Figure 1, all APIs are available for C++, Java*, and Python* (a recent addition, available starting with the 2017 beta version). Many of the algorithms implemented in the library can be executed in three main modes:

  • Batch: in this mode, processing occurs serially, e.g., the training algorithm is executed sequentially on a single node;
  • Distributed: as the name suggests, in this mode the dataset must be split and distributed among the computing nodes. The algorithm then calculates partial solutions and, at the last step, unifies them; and
  • Online: in this mode, the data is treated as a continuous stream. Processing occurs by building incremental models and, at the end, building a full model from the partial models.

The processing modes, together with additional details on data management and how to use pyDAAL to implement a simple linear regression solution for a prediction problem, are covered in this whitepaper.
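
To make the difference between these modes concrete, here is a small, self-contained sketch in plain C++ (deliberately not using the DAAL API; every name in it is illustrative only). The batch function consumes the whole dataset in one pass, while the partial/merge pair mimics how the distributed and online modes compute partial results that are later combined into a full result:

#include <cstddef>
#include <cstdio>
#include <numeric>
#include <vector>

// Batch mode: the whole dataset is processed in one pass on a single node.
double meanBatch(const std::vector<double>& data) {
    return std::accumulate(data.begin(), data.end(), 0.0) / data.size();
}

// Distributed/online mode: each chunk (a node's share or a stream block)
// yields a partial result; the partial results are merged at the end.
struct PartialMean { double sum; std::size_t count; };

PartialMean computePartial(const std::vector<double>& chunk) {
    PartialMean p;
    p.sum = std::accumulate(chunk.begin(), chunk.end(), 0.0);
    p.count = chunk.size();
    return p;
}

double finalizeMean(const std::vector<PartialMean>& partials) {
    PartialMean total = { 0.0, 0 };
    for (const auto& p : partials) { total.sum += p.sum; total.count += p.count; }
    return total.sum / total.count;
}

int main() {
    std::vector<double> data = {1, 2, 3, 4, 5, 6};
    std::vector<std::vector<double>> chunks = {{1, 2, 3}, {4, 5, 6}};

    std::vector<PartialMean> partials;
    for (const auto& c : chunks) partials.push_back(computePartial(c));

    std::printf("batch: %f  merged: %f\n", meanBatch(data), finalizeMean(partials));
    return 0;
}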

Source available on GitHub

 


Vector (SIMD) Function ABI

Vector Function Application Binary Interface

Adapted from the November 2015 version by

Xinmin Tian, Hideki Saito, Sergey Kozhukhov, Kevin B. Smith,
Robert Geva, Milind Girkar and Serguei V. Preis
Intel® Mobile Computing and Compilers

Please see attachment.

 

Why & When Deep Learning Works: Looking Inside Deep Learning

Ronny Ronen
The Intel Collaborative Research Institute for Computational Intelligence (ICRI-CI)1

In recent years, Deep Learning has emerged as the leading technology for accomplishing a broad range of artificial intelligence tasks (LeCun et al. (2015); Goodfellow et al. (2016)). Deep learning is the state-of-the-art approach across many domains, including object recognition and identification, text understanding and translation, question answering, and more. In addition, it is expected to play a key role in many new usages deemed almost impossible before, such as fully autonomous driving.

While the ability of Deep Learning to solve complex problems has been demonstrated again and again, there is still a lot of mystery as to why it works, what it is really capable of accomplishing, and when it works (and when it does not). Such an understanding is important for both theoreticians and practitioners, in order to know how such methods can be utilized safely and in the best possible manner. An emerging body of work has sought to develop some insights in this direction, but much remains unknown. The general feeling is that Deep Learning is still by and large “black magic”: we know it works, but we do not truly understand why. This lack of knowledge disturbs scientists and is a cause for concern for developers: would you let an autonomous car be driven by a system whose mechanisms and weak spots are not fully understood?

The Intel Collaborative Research Institute for Computational Intelligence (ICRI-CI) has been heavily supporting Machine Learning and Deep Learning research from its foundation in 2012. We have asked six leading ICRI-CI Deep Learning researchers to address the challenge of “Why & When Deep Learning works”, with the goal of looking inside Deep Learning, providing insights on how deep networks function, and uncovering key observations on their expressiveness, limitations, and potential.

The output of this challenge call was quite impressive, resulting in five papers that address different facets of deep learning. These papers summarize the researchers’ ongoing recent work published in leading conferences and journals, as well as new research results made especially for this compilation. These facets include a high-level understanding of why and when deep networks work (and do not work), the impact of geometry on the expressiveness of deep networks, and making deep networks interpretable.

Understanding why and when deep networks work (and do not work)

  1. Naftali Tishby and Ravid Schwartz-Ziv, in Opening the Black Box of Deep Neural Networks via Information, study Deep Networks by analyzing their information-theoretic properties, looking at what information about the input and output each layer preserves, and suggest that the network implicitly attempts to optimize the Information-Bottleneck (IB) tradeoff between compression and prediction, successively, for each layer. Moreover, they show that the stochastic gradient descent (SGD) epochs used to train such networks have two distinct phases for each layer: fast empirical error minimization, followed by slow representation compression. They then present a new theoretical argument for the computational benefit of the hidden layers.
     
  2. Shai Shalev-Shwartz, Ohad Shamir and Shaked Shamma in Failures of Gradient-Based Deep Learning attempt to gain a deeper understanding of the difficulties and limitations associated with common approaches and algorithms. They describe four families of problems for which some of the commonly used existing algorithms fail or suffer significant difficulty, illustrate the failures through practical experiments, and provide theoretical insights explaining their source, along with remedies that overcome the failures and lead to performance improvements.
     
  3. Amnon Shashua, Nadav Cohen, Or Sharir, Ronen Tamari, David Yakira and Yoav Levine in Analysis and Design of Convolutional Networks via Hierarchical Tensor Decompositions analyze the expressive properties of deep convolutional networks. Through an equivalence to hierarchical tensor decompositions, they study the expressive efficiency and inductive bias of various architectural features in convolutional networks (depth, width, pooling geometry, inter-connectivity, overlapping operations etc.). Their results shed light on the demonstrated effectiveness of convolutional networks, and in addition, provide new tools for network design.

    The impact of geometry on the expressiveness of deep networks
     
  4. Nathan Srebro, Behnam Neyshabur, Ryota Tomioka and Ruslan Salakhutdinov in Geometry of Optimization and Implicit Regularization in Deep Learning argue that the optimization methods used for training neural networks play a crucial role in generalization ability of deep learning models, through implicit regularization. They demonstrate that generalization ability is not controlled simply by network size, but rather by some other implicit control. Then, by studying the geometry of the parameter space of deep networks and devising an optimization algorithm attuned to this geometry, they demonstrate how changing the empirical optimization procedure can improve generalization performance.

    Interpretability of deep networks
     
  5. Shie Mannor, Tom Zahavy and Nir Baram in Graying the black box: Understanding DQNs present a methodology and tools to analyze Deep Q-networks (DQNs) in a non-blind manner. They propose a new model, the Semi Aggregated Markov Decision Process (SAMDP), and an algorithm that learns it automatically. Using these tools they reveal that the features learned by DQNs aggregate the state space in a hierarchical fashion, explaining their success. Moreover, they are able to look into the network to understand and describe the policies learned by DQNs for three different Atari 2600 games and suggest ways to interpret, debug and optimize deep neural networks in reinforcement learning.

References

Ian Goodfellow, Yoshua Bengio, and Aaron Courville. Deep learning. MIT Press, 2016.

Yann LeCun, Yoshua Bengio, and Geoffrey Hinton. Deep learning. Nature, 521(7553): 436–444, 2015.

1 This work was done with the support of the Intel Collaborative Research Institute for Computational Intelligence (ICRI-CI). This paper is the preface of the ’Why & When Deep Learning works: looking inside Deep Learning’ ICRI-CI paper bundle.

Designing Scalable IoT Architectures

Designing for the Internet of Things is challenging. The technology is rapidly changing, and architecting for these situations can be complex. This article discusses both design considerations for IoT and new methods for creating a robust network using Intel® processors.

Latency, Bandwidth and Reliability

Design practices for Internet of Things (IoT) devices are changing. Developers used to just watch processes from afar, but now they control them in real time. A result of this change has been an increase in IoT network complexity. For IoT devices that depend on Internet access, this can result in several challenges along the network paths to cloud servers: high latencies, low bandwidths, and decreased reliability.

These trends have led to new topologies in IoT networks, such as Fog Computing (a network layer below the cloud). Deploying cloud elements closer to the edge of the network (or even onsite) reduces latencies while also preventing bandwidth bottlenecks. In order to achieve these goals, edge networks and Fog Computing require high-performance computing resources, as well as high-speed storage and networking.

Scalable Design

The challenge for IoT is twofold:

1) Design scalable and reliable devices.

2) Architect flexible cloud elements with the lowest latency, highest bandwidth, and best reliability possible.

IoT Design with Intel® Processors

Intel supplies a wide range of processor products that allow IoT designers to scale both hardware and software to meet these design goals (flexible, scalable, and reliable). Many of these processor families also have integrated GPUs, which offer extra processing resources.

There are four main product families:

  1. Intel® Quark™ processor
  2. Intel® Core™ processor
  3. Intel Atom® processor
  4. Intel® Xeon® processor


Figure 1 Designing to Scale

Early Big Data and Current IoT Architectures

Early big data architectures were based on sensors with networking capabilities. These accessed the Internet and transmitted data into cloud applications for later retrieval and analysis.

Current IoT architectures evolved into networks that either forward data in near real-time (to generate event-based responses) or function as sensor-actor networks.

About Sensors

Sensors are devices that detect or measure a physical property (temperature, humidity, light, etc.). Controllers receive input from sensors and initiate actions. These actions usually involve an actor or actuator adjusting or maintaining the desired output of a specific process. Consider, for example, a sensor-based plant watering system: a moisture sensor measures the water saturation of the soil, and if that level falls below a certain threshold, a controller initiates an action to open a water valve.
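
As a rough, self-contained sketch of that control loop (readMoisture and setValve are hypothetical stand-ins for real sensor and actuator drivers, and the threshold is an assumed value):

#include <chrono>
#include <cstdio>
#include <thread>

// Hypothetical drivers, stubbed out so that the sketch compiles on its own.
double readMoisture() { return 0.25; }   // pretend the soil is fairly dry (scale 0.0..1.0)
void setValve(bool open) { std::printf(open ? "valve open\n" : "valve closed\n"); }

int main() {
    const double threshold = 0.30;       // assumed saturation threshold
    for (int i = 0; i < 3; ++i) {        // a few iterations stand in for an endless control loop
        // The controller compares the sensor reading with the threshold
        // and drives the actuator (the water valve) accordingly.
        setValve(readMoisture() < threshold);
        std::this_thread::sleep_for(std::chrono::milliseconds(100));
    }
    return 0;
}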


Figure 2 Evolution of IoT Networks

Evolving IoT Networks

Figure 2 illustrates how latency becomes a significant issue as the IoT network grows more complex (specifically, in the Sensor-Actor Real-Time Data Model). IoT designs must take two things into account: 1) a rapidly growing network of sensors and 2) the systems acting upon that network. Integrating fog architectures into existing IoT networks helps reduce latency by bringing cloud elements closer to the edge.

As IoT evolves and more sophisticated applications are designed, the entire end-to-end IoT chain will need even more computing resources, while still requiring power consumption optimization. This need for processing power is constantly growing, meaning that designers need to account for some extra headroom for future software upgrades.

Migrating the cloud elements to the edge network or to the LAN (Figure 3) reduces network latencies accordingly. The real-time data path to the on-premises datacenter bypasses the access network, which yields the bandwidth and reliability benefits of the LAN.


Figure 3 Comparison of different types of IoT Networks

IoT Network Stacks

An increase in network complexity, along with growing demand for IoT, has resulted in the exponential growth of complex network stacks. Network stacks now not only need to handle IoT protocols, they must also account for security, encryption, and independent processors that handle additional tasks.

IoT Network Architecture

When planning the architecture for an IoT network, it's important to consider the downstream processing of the network. Consider, for example, a smart building where a sensor is linked to a lighting appliance. The appliance may be part of a larger building application, and the smart building may itself be part of a smart city network. In this case, data is not only being passed locally, but is also being transferred to a larger building network, and ultimately to a much larger city network.

Application Demands

As sensors grow in complexity and their deployment becomes widespread, it's important to ensure that processors account for more than just network connectivity. Sensors now produce increasingly large data sets, and digital sensors that use GPIO or analog connections have large volumes of data to move and manage in real time. It's important to scale the independent microcontroller and bus interfaces in system designs to meet application demands. For example, Fog Node or edge computing will be needed as LIDAR, radar, ultrasound, and video (vision) sensors are added, in order to keep up with real-time computing applications.

Autonomous Systems

Autonomous control and adaptive learning control systems should be accounted for in current and future IoT system designs. Implementation of autonomous systems is becoming more widespread, and as emerging technologies continue to progress, being able to scale a design for future use is just as advantageous as offering the solution in your design today. Smart homes, connected cars, artificial intelligence, and embedded deep learning are coming soon to the marketplace.

IoT Power and Performance with an Intel® Processor Family

Intel offers four families of processors that make low latency, high bandwidth, and increased reliability achievable, all without increasing power consumption or sacrificing performance. The Internet of Things is a fast-growing and complex system with many design considerations, such as latency issues or ISP bottlenecks, both of which can be addressed with Intel® processors. Moving big data computing to the edge (and within LAN Fog Nodes) increases onsite computing resources and sensor capability, frees up bandwidth, and increases reliability in IoT networks.


The Evil within the Comparison Functions

Perhaps readers remember my article titled "Last line effect". It describes a pattern I once noticed: in most cases programmers make an error in the last line of similar text blocks. Now I want to tell you about a new interesting observation. It turns out that programmers tend to make mistakes in functions that compare two objects. This statement looks implausible; however, I'll show you a great number of examples of errors that may be shocking to a reader. So, here is a new piece of research; it will be quite amusing and scary.

Problematics

Here is my statement: programmers quite often make mistakes in rather simple functions that are meant to compare two objects. This claim is based on our team's experience of checking a large number of open source projects in C, C++, and C#.

The functions we are going to consider here are IsEqual, Equals, Compare, AreEqual, and so on, as well as overloaded operators such as == and !=.

I noticed that, when writing articles, I very often come across errors related to comparison functions. I decided to explore this question in detail and examined our database of detected errors, searching it for functions containing the words Cmp, Equal, Compare, and the like. The result was very impressive and shocking.

In fact, this story is similar to the one we had when writing the article "Last line effect". Similarly, I noticed an anomaly and decided to explore it more carefully. Unfortunately, unlike that article, I don't know how to gather statistics here or which figures to provide. Perhaps later I'll come up with a way to collect them. At this point I am guided by intuition and can only share my impression: there are a lot of errors in comparison functions, and I am sure you will get the same feeling when you see this huge number of truly impressive examples.

Psychology

For a moment, let's go back to the article "Last line effect". By the way, if you haven't read it, I suggest taking a break and looking at it. There is also a more detailed analysis of this topic: "The last line effect explained".

In general, we can conclude that the cause of the errors in the last lines is related to the fact that the developer has already mentally moved on to the next lines/tasks instead of focusing on completing the current fragment. As a result, when writing similar blocks of text, there is a higher probability that a programmer will make an error in the last one.

I believe that when writing a comparison function, developers often don't focus on it, considering it too trivial. In other words, they write the code automatically, without thinking it over. Otherwise, it is not clear how one could make an error like this:

bool IsLuidsEqual(LUID luid1, LUID luid2)
{
  return (luid1.LowPart == luid2.LowPart) &&
         (luid2.HighPart == luid2.HighPart);
}

PVS-Studio analyzer detected this error in the code of RunAsAdmin Explorer Shim (C++) project: V501 There are identical sub-expressions to the left and to the right of the '==' operator: luid2.HighPart == luid2.HighPart RAACommon raacommonfuncs.cpp 1511

A typo. In the second line it should be: luid1.HighPart == luid2.HighPart.

The code is very simple. Apparently, the simplicity of the code spoils everything. A programmer immediately regards the task of writing such a function as standard and uninteresting. He instantly thinks of a way to write the function and only has to implement the code. This is a routine, but unfortunately inevitable, step before starting to write more important, complex, and interesting code. He is already thinking about the next task... and as a result makes an error.

In addition, programmers rarely write unit tests for such functions. Again, the simplicity of these functions prevents it. It seems that testing them would be overkill, as these functions are simple and repetitive. A person has written hundreds of such functions in his life; can he really make an error in one more? Yes, he can, and he does.
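
To show how little it takes, here is a minimal sketch of such a test for a hypothetical three-field structure (the Point type and its values are invented for this illustration). Each member is varied exactly once, so a copy-paste slip of the kind shown later in this article, such as comparing a field with itself, trips one of the assertions immediately:

#include <cassert>

// A hypothetical struct with a hand-written comparison operator.
struct Point {
    int x, y, z;
    bool operator==(const Point& o) const {
        return x == o.x && y == o.y && z == o.z;
    }
};

int main() {
    // Vary each member once: a slip such as "y == y" instead of "y == o.y"
    // would make the third assertion fail.
    assert( (Point{1, 2, 3} == Point{1, 2, 3}));
    assert(!(Point{1, 2, 3} == Point{9, 2, 3}));
    assert(!(Point{1, 2, 3} == Point{1, 9, 3}));
    assert(!(Point{1, 2, 3} == Point{1, 2, 9}));
    return 0;
}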

I would also like to note that we aren't talking about the code of students who are just learning to program. We are talking about bugs in the code of projects such as GCC, Qt, GDB, LibreOffice, Unreal Engine 4, CryEngine V, Chromium, MongoDB, Oracle VM Virtual Box, FreeBSD, WinMerge, the CoreCLR, MySQL, Mono, CoreFX, Roslyn, MSBuild, etc. It's all very serious.

We are going to have a look at so many diverse examples that it would be scary to sleep at night.

Erroneous Patterns in Comparison Functions

All errors in comparison functions will be divided into several patterns. In the article we'll be talking about errors in projects in C, C++ and C#, but it makes no sense to separate these languages, as most of the patterns are similar for different languages.

Pattern: A < B, B > A

Very often in the comparison functions there is a need to make such checks:

  • A < B
  • A > B

Sometimes programmers think that it is more elegant to use the same operator <, but with the operands switched:

  • A < B
  • B < A

However, due to inattentiveness, we end up with checks like these:

  • A < B
  • B > A

In fact, one and the same comparison is done twice here. If that is not yet clear, the short sketch below and the practical examples that follow will make it so.
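
Before the practical examples, here is a minimal sketch of the correct idiom for a hypothetical two-field key (the names are invented for this illustration). The second check uses the same operator < with the operands swapped; writing B > A there would simply repeat the first check, and the function would fall through to the secondary field even when B is less than A:

#include <cassert>

struct Key {
    int primary;
    int secondary;

    bool operator<(const Key& other) const {
        if (primary < other.primary)
            return true;
        if (other.primary < primary)   // same operator, operands swapped
            return false;
        // primary parts are equal: decide by the secondary field
        return secondary < other.secondary;
    }
};

int main() {
    assert( (Key{1, 5} < Key{2, 0}));
    assert(!(Key{2, 0} < Key{1, 5}));
    assert( (Key{1, 1} < Key{1, 2}));  // equal primary parts: decided by secondary
    return 0;
}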

string _server;
....
bool operator<( const ServerAndQuery& other ) const {
  if ( ! _orderObject.isEmpty() )
    return _orderObject.woCompare( other._orderObject ) < 0;

  if ( _server < other._server )
    return true;
  if ( other._server > _server )
    return false;
  return _extra.woCompare( other._extra ) < 0;
}

PVS-Studio analyzer detected this error in the code of MongoDB (C++): V581 The conditional expressions of the 'if' operators situated alongside each other are identical. Check lines: 44, 46. parallel.h 46

This condition:

if ( other._server > _server )

will always be false, as the same check was done two lines earlier. The correct code variant:

if ( _server < other._server )
  return true;
if ( other._server < _server )
  return false;

This error was detected in the code of Chromium project (C++):

enum ContentSettingsType;
struct EntryMapKey {
  ContentSettingsType content_type;
  ...
};

bool OriginIdentifierValueMap::EntryMapKey::operator<(
    const OriginIdentifierValueMap::EntryMapKey& other) const {
  if (content_type < other.content_type)
    return true;
  else if (other.content_type > content_type)
    return false;
  return (resource_identifier < other.resource_identifier);
}

PVS-Studio warning: V517 The use of 'if (A) {...} else if (A) {...}' pattern was detected. There is a probability of logical error presence. Check lines: 61, 63. browser content_settings_origin_identifier_value_map.cc 61

That was a C++ example; now it's C#'s turn. The next error was found in the code of IronPython and IronRuby (C#).

public static int Compare(SourceLocation left,
                          SourceLocation right) {
  if (left < right) return -1;
  if (right > left) return 1;
  return 0;
}

PVS-Studio warning (C#): V3021 There are two 'if' statements with identical conditional expressions. The first 'if' statement contains method return. This means that the second 'if' statement is senseless. SourceLocation.cs 156

I think no explanation is needed.

Note. For C# there was just one example of an error, but for C++ there were two. In general, there will be fewer bugs in the C# code than in the C/C++ code. But I do not recommend rushing to the conclusion that C# is much safer. The thing is that the PVS-Studio analyzer has learned to check C# code only relatively recently, and we have simply checked fewer projects written in C# than in C and C++.

Pattern: A Member of the Class Is Compared with Itself

Comparison functions usually consist of successive comparisons of structure/class members. Such code tends to become erroneous when a member of the class ends up being compared with itself. I can identify two subtypes of this error.

In the first case, a programmer forgets to specify the name of the object and writes in the following way:

return m_x == foo.m_x &&
       m_y == m_y &&            // <=
       m_z == foo.m_z;

In the second case, the same object name is written on both sides:

return zzz.m_x == foo.m_x &&
       zzz.m_y == zzz.m_y &&    // <=
       zzz.m_z == foo.m_z;

Let's take a closer look at practical examples of this pattern. Note that the incorrect comparison often occurs in the last of several similar code blocks, which reminds us of the "last line effect" again.

This error was found in the code of the Unreal Engine 4 (C++) project:

bool
Compare(const FPooledRenderTargetDesc& rhs, bool bExact) const
{
  ....
  return Extent == rhs.Extent
      && Depth == rhs.Depth
      && bIsArray == rhs.bIsArray
      && ArraySize == rhs.ArraySize
      && NumMips == rhs.NumMips
      && NumSamples == rhs.NumSamples
      && Format == rhs.Format
      && LhsFlags == RhsFlags
      && TargetableFlags == rhs.TargetableFlags
      && bForceSeparateTargetAndShaderResource ==
         rhs.bForceSeparateTargetAndShaderResource
      && ClearValue == rhs.ClearValue
      && AutoWritable == AutoWritable;           // <=
}

PVS-Studio warning: V501 There are identical sub-expressions to the left and to the right of the '==' operator: AutoWritable == AutoWritable rendererinterface.h 180

The code of Samba (C) project:

static int compare_procids(const void *p1, const void *p2)
{
  const struct server_id *i1 = (struct server_id *)p1;
  const struct server_id *i2 = (struct server_id *)p2;

  if (i1->pid < i2->pid) return -1;
  if (i2->pid > i2->pid) return 1;
  return 0;
}

PVS-Studio warning: V501 There are identical sub-expressions to the left and to the right of the '>' operator: i2->pid > i2->pid brlock.c 1901

The code of MongoDB (C++) project:

bool operator==(const MemberCfg& r) const {
  ....
  return _id==r._id && votes == r.votes &&
         h == r.h && priority == r.priority &&
         arbiterOnly == r.arbiterOnly &&
         slaveDelay == r.slaveDelay &&
         hidden == r.hidden &&
         buildIndexes == buildIndexes;        // <=
}

PVS-Studio warning: V501 There are identical sub-expressions to the left and to the right of the '==' operator: buildIndexes == buildIndexes rs_config.h 101

The code of Geant4 Software (C++) project:

inline G4bool G4FermiIntegerPartition::
operator==(const G4FermiIntegerPartition& right)
{
  return (total == right.total &&
          enableNull == enableNull &&          // <=
          partition == right.partition);
}

PVS-Studio warning: V501 There are identical sub-expressions to the left and to the right of the '==' operator: enableNull == enableNull G4hadronic_deex_fermi_breakup g4fermiintegerpartition.icc 58

The code of LibreOffice (C++) project:

class SvgGradientEntry
{
  ....
  bool operator==(const SvgGradientEntry& rCompare) const
  {
    return (getOffset() == rCompare.getOffset()
         && getColor() == getColor()            // <=
         && getOpacity() == getOpacity());      // <=
  }
  ....
}

PVS-Studio warning: V501 There are identical sub-expressions to the left and to the right of the '==' operator: getColor() == getColor() svggradientprimitive2d.hxx 61

The code of Chromium (C++) project:

bool FileIOTest::MatchesResult(const TestStep& a,
                               const TestStep& b) {
  ....
  return (a.data_size == a.data_size &&             // <=
          std::equal(a.data, a.data + a.data_size, b.data));
}

PVS-Studio warning: V501 There are identical sub-expressions to the left and to the right of the '==' operator: a.data_size == a.data_size cdm_file_io_test.cc 367

The code of FreeCAD (C++) project:

bool FaceTypedBSpline::isEqual(const TopoDS_Face &faceOne,
                               const TopoDS_Face &faceTwo) const
{
  ....
  if (surfaceOne->IsURational() !=
      surfaceTwo->IsURational())
    return false;
  if (surfaceTwo->IsVRational() !=         // <=
      surfaceTwo->IsVRational())           // <=
    return false;
  if (surfaceOne->IsUPeriodic() !=
      surfaceTwo->IsUPeriodic())
    return false;
  if (surfaceOne->IsVPeriodic() !=
      surfaceTwo->IsVPeriodic())
    return false;
  if (surfaceOne->IsUClosed() !=
      surfaceTwo->IsUClosed())
    return false;
  if (surfaceOne->IsVClosed() !=
      surfaceTwo->IsVClosed())
    return false;
  if (surfaceOne->UDegree() !=
      surfaceTwo->UDegree())
    return false;
  if (surfaceOne->VDegree() !=
      surfaceTwo->VDegree())
    return false;
  ....
}

PVS-Studio warning: V501 There are identical sub-expressions 'surfaceTwo->IsVRational()' to the left and to the right of the '!=' operator. modelrefine.cpp 780

The code of Serious Engine (C++) project:

class CTexParams {
public:

  inline BOOL IsEqual( CTexParams tp) {
    return tp_iFilter     == tp.tp_iFilter &&
           tp_iAnisotropy == tp_iAnisotropy &&             // <=
           tp_eWrapU      == tp.tp_eWrapU &&
           tp_eWrapV      == tp.tp_eWrapV; };
  ....
};

PVS-Studio warning: V501 There are identical sub-expressions to the left and to the right of the '==' operator: tp_iAnisotropy == tp_iAnisotropy gfx_wrapper.h 180

The code of Qt (C++) project:

inline bool qCompare(QImage const &t1, QImage const &t2, ....)
{
  ....
  if (t1.width() != t2.width() || t2.height() != t2.height()) {
  ....
}

PVS-Studio warning: V501 There are identical sub-expressions to the left and to the right of the '!=' operator: t2.height() != t2.height() qtest_gui.h 101

The code of FreeBSD (C) project:

static int
compare_sh(const void *_a, const void *_b)
{
  const struct ipfw_sopt_handler *a, *b;

  a = (const struct ipfw_sopt_handler *)_a;
  b = (const struct ipfw_sopt_handler *)_b;
  ....
  if ((uintptr_t)a->handler < (uintptr_t)b->handler)
    return (-1);
  else if ((uintptr_t)b->handler > (uintptr_t)b->handler) // <=
    return (1);

  return (0);
}

PVS-Studio warning: V501 There are identical sub-expressions '(uintptr_t) b->handler' to the left and to the right of the '>' operator. ip_fw_sockopt.c 2893

The code of Mono (C#) project:

static bool AreEqual (VisualStyleElement value1,
                      VisualStyleElement value2)
{
  return
    value1.ClassName == value1.ClassName && // <=
    value1.Part == value2.Part &&
    value1.State == value2.State;
}

PVS-Studio warning: V3001 There are identical sub-expressions 'value1.ClassName' to the left and to the right of the '==' operator. ThemeVisualStyles.cs 2141

The code of Mono (C#) project:

public int ExactInference (TypeSpec u, TypeSpec v)
{
  ....
  var ac_u = (ArrayContainer) u;
  var ac_v = (ArrayContainer) v;
  ....
  var ga_u = u.TypeArguments;
  var ga_v = v.TypeArguments;
  ....
  if (u.TypeArguments.Length != u.TypeArguments.Length) // <=
    return 0;

  ....
}

PVS-Studio warning: V3001 There are identical sub-expressions 'u.TypeArguments.Length' to the left and to the right of the '!=' operator. generic.cs 3135

The code of MonoDevelop (C#) project:

Accessibility DeclaredAccessibility { get; }
bool IsStatic { get; }

private bool MembersMatch(ISymbol member1, ISymbol member2)
{
  if (member1.Kind != member2.Kind)
  {
    return false;
  }

  if (member1.DeclaredAccessibility !=          // <=1
      member1.DeclaredAccessibility             // <=1
   || member1.IsStatic != member1.IsStatic)     // <=2
  {
    return false;
  }

  if (member1.ExplicitInterfaceImplementations().Any() ||
      member2.ExplicitInterfaceImplementations().Any())
  {
    return false;
  }

  return SignatureComparer
    .HaveSameSignatureAndConstraintsAndReturnTypeAndAccessors(
       member1, member2, this.IsCaseSensitive);
}

PVS-Studio warning: V3001 There are identical sub-expressions 'member1.IsStatic' to the left and to the right of the '!=' operator. CSharpBinding AbstractImplementInterfaceService.CodeAction.cs 545

The code of Haiku (C++) project:

int __CORTEX_NAMESPACE__ compareTypeAndID(....)
{
  int retValue = 0;
  ....
  if (lJack && rJack)
  {
    if (lJack->m_jackType < lJack->m_jackType)           // <=
    {
      return -1;
    }
    if (lJack->m_jackType == lJack->m_jackType)          // <=
    {
      if (lJack->m_index < rJack->m_index)
      {
        return -1;
      }
      else
      {
        return 1;
      }
    }
    else if (lJack->m_jackType > rJack->m_jackType)
    {
      retValue = 1;
    }
  }
  return retValue;
}

PVS-Studio warning: V501 There are identical sub-expressions to the left and to the right of the '<' operator: lJack->m_jackType < lJack->m_jackType MediaJack.cpp 783

Just below there is exactly the same error. As I understand, in both cases a programmer forgot to replace lJack with rJack.

The code of CryEngine V (C++) project:

bool
CompareRotation(const Quat& q1, const Quat& q2, float epsilon)
{
  return (fabs_tpl(q1.v.x - q2.v.x) <= epsilon)
      && (fabs_tpl(q1.v.y - q2.v.y) <= epsilon)
      && (fabs_tpl(q2.v.z - q2.v.z) <= epsilon)     // <=
      && (fabs_tpl(q1.w - q2.w) <= epsilon);
}

PVS-Studio warning: V501 There are identical sub-expressions to the left and to the right of the '-' operator: q2.v.z - q2.v.z entitynode.cpp 93

Pattern: Evaluating the Size of a Pointer Instead of the Size of the Structure/Class

This type of error occurs in programs written in C and C++ and is caused by incorrect use of the sizeof operator. The error consists in evaluating not the size of the object, but the size of the pointer. Example:

T *a = foo1();
T *b = foo2();
x = memcmp(a, b, sizeof(a));

Instead of the size of the T structure, the size of a pointer gets evaluated. The size of a pointer depends on the data model used, but it is usually 4 or 8 bytes. As a result, more or fewer bytes of memory get compared than the structure occupies.

Correct variant of the code:

x = memcmp(a, b, sizeof(T));

or

x = memcmp(a, b, sizeof(*a));

Now let's move on to the practical part. Here is how such a bug looks in the code of CryEngine V (C++):

bool
operator==(const SComputePipelineStateDescription& other) const
{
  return 0 == memcmp(this, &other, sizeof(this));
}

PVS-Studio warning: V579 The memcmp function receives the pointer and its size as arguments. It is possibly a mistake. Inspect the third argument. graphicspipelinestateset.h 58

The code of Unreal Engine 4 project (C++):

bool FRecastQueryFilter::IsEqual(
  const INavigationQueryFilterInterface* Other) const
{
  // @NOTE: not type safe, should be changed when
  // another filter type is introduced
  return FMemory::Memcmp(this, Other, sizeof(this)) == 0;

}

PVS-Studio warning: V579 The Memcmp function receives the pointer and its size as arguments. It is possibly a mistake. Inspect the third argument. pimplrecastnavmesh.cpp 172

Pattern: Repetitive Arguments of Cmp(A, A) Type

Comparison functions usually call other comparison functions. One possible error is that a reference/pointer to the same object gets passed twice. Example:

x = memcmp(A, A, sizeof(T));

Here the object A will be compared with itself, which, of course, makes no sense.

We'll start with an error found in the GDB debugger (C):

static int
psymbol_compare (const void *addr1, const void *addr2,
                 int length)
{
  struct partial_symbol *sym1 = (struct partial_symbol *) addr1;
  struct partial_symbol *sym2 = (struct partial_symbol *) addr2;

  return (memcmp (&sym1->ginfo.value, &sym1->ginfo.value,    // <=
                  sizeof (sym1->ginfo.value)) == 0
          && sym1->ginfo.language == sym2->ginfo.language
          && PSYMBOL_DOMAIN (sym1) == PSYMBOL_DOMAIN (sym2)
          && PSYMBOL_CLASS (sym1) == PSYMBOL_CLASS (sym2)
          && sym1->ginfo.name == sym2->ginfo.name);
}

PVS-Studio warning: V549 The first argument of 'memcmp' function is equal to the second argument. psymtab.c 1580

The code of CryEngineSDK project (C++):

inline bool operator != (const SEfResTexture &m) const
{
  if (stricmp(m_Name.c_str(), m_Name.c_str()) != 0 ||   // <=
      m_TexFlags != m.m_TexFlags ||
      m_bUTile != m.m_bUTile ||
      m_bVTile != m.m_bVTile ||
      m_Filter != m.m_Filter ||
      m_Ext != m.m_Ext ||
      m_Sampler != m.m_Sampler)
    return true;
  return false;
}

PVS-Studio warning: V549 The first argument of 'stricmp' function is equal to the second argument. ishader.h 2089

The code of PascalABC.NET (C#):

private List<string> enum_consts = new List<string>();
public override bool IsEqual(SymScope ts)
{
  EnumScope es = ts as EnumScope;
  if (es == null) return false;
  if (enum_consts.Count != es.enum_consts.Count) return false;
  for (int i = 0; i < es.enum_consts.Count; i++)
    if (string.Compare(enum_consts[i],
                       this.enum_consts[i], true) != 0)
      return false;
  return true;
}

PVS-Studio warning: V3038 The 'enum_consts[i]' argument was passed to 'Compare' method several times. It is possible that other argument should be passed instead. CodeCompletion SymTable.cs 2206

Let me give some explanation here. The error is in the actual arguments of the Compare function:

string.Compare(enum_consts[i], this.enum_consts[i], true)

The thing is that enum_consts[i] and this.enum_consts[i] are the same thing. As I understand it, the correct call should look like this:

string.Compare(es.enum_consts[i], this.enum_consts[i], true)

or

string.Compare(enum_consts[i], es.enum_consts[i], true)

Pattern: Repetitive Checks A==B && A==B

Quite a common error in programming is when the same check is done twice. Example:

return A == B &&
       C == D &&   // <=
       C == D &&   // <=
       E == F;

Two variants are possible in this case. The first is quite harmless: one comparison is redundant and can be simply removed. The second is worse: some other variables were to be compared, but a programmer made a typo.

In any case, such code deserves close attention. Let me scare you a little more and show that this error can be found even in the code of the GCC compiler (C):

static bool
dw_val_equal_p (dw_val_node *a, dw_val_node *b)
{
  ....
  case dw_val_class_vms_delta:
    return (!strcmp (a->v.val_vms_delta.lbl1,
                     b->v.val_vms_delta.lbl1)
            && !strcmp (a->v.val_vms_delta.lbl1,
                        b->v.val_vms_delta.lbl1));
  ....
}

PVS-Studio warning: V501 There are identical sub-expressions '!strcmp(a->v.val_vms_delta.lbl1, b->v.val_vms_delta.lbl1)' to the left and to the right of the '&&' operator. dwarf2out.c 1428

The function strcmp is called twice with the same set of arguments.

The code of Unreal Engine 4 project (C++):

FORCEINLINE
bool operator==(const FShapedGlyphEntryKey& Other) const
{
  return FontFace == Other.FontFace
      && GlyphIndex == Other.GlyphIndex   // <=
      && FontSize == Other.FontSize
      && FontScale == Other.FontScale
      && GlyphIndex == Other.GlyphIndex;  // <=
}

PVS-Studio warning: V501 There are identical sub-expressions 'GlyphIndex == Other.GlyphIndex' to the left and to the right of the '&&' operator. fontcache.h 139

The code of Serious Engine project (C++):

inline BOOL CValuesForPrimitive::operator==(....)
{
  return (
 (....) &&
 (vfp_ptPrimitiveType == vfpToCompare.vfp_ptPrimitiveType) &&
 ....
 (vfp_ptPrimitiveType == vfpToCompare.vfp_ptPrimitiveType) &&
 ....
);
}

PVS-Studio warning: V501 There are identical sub-expressions '(vfp_ptPrimitiveType == vfpToCompare.vfp_ptPrimitiveType)' to the left and to the right of the '&&' operator. worldeditor.h 580

The code of Oracle VM Virtual Box project (C++):

typedef struct SCMDIFFSTATE
{
  ....
  bool  fIgnoreTrailingWhite;
  bool  fIgnoreLeadingWhite;
  ....
} SCMDIFFSTATE;
/* Pointer to a diff state. */

typedef SCMDIFFSTATE *PSCMDIFFSTATE;

/* Compare two lines */
DECLINLINE(bool) scmDiffCompare(PSCMDIFFSTATE pState, ....)
{
  ....
  if (pState->fIgnoreTrailingWhite    // <=
   || pState->fIgnoreTrailingWhite)   // <=
    return scmDiffCompareSlow(....);
  ....
}

PVS-Studio warning: V501 There are identical sub-expressions 'pState->fIgnoreTrailingWhite' to the left and to the right of the '||' operator. scmdiff.cpp 238

Pattern: Incorrect Use of the Value Returned by the memcmp Function

The memcmp function returns the following values of int type:

  • < 0 - buf1 less than buf2;
  • 0 - buf1 identical to buf2;
  • > 0 - buf1 greater than buf2;

Please note that '> 0' can be any positive number, not only 1. It could be 2, 3, 100, 256, 1024, 5555, 65536, and so on. This means that the result cannot be stored in a variable of type char or short: the high bits can be lost, which may violate the logic of program execution.

This also means that the result cannot be compared with the constants 1 or -1. In other words, it is wrong to write this:

if (memcmp(a, b, sizeof(T)) == 1)
if (memcmp(x, y, sizeof(T)) == -1)

Correct comparisons:

if (memcmp(a, b, sizeof(T)) > 0)
if (memcmp(a, b, sizeof(T)) < 0)

The danger of this code is that it may work successfully for a long time. The errors may start showing up when moving to a new platform or after a change of compiler version.
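
To make the danger tangible, here is a small stand-alone illustration. The concrete return values are hypothetical, since the standard only guarantees the sign of the result; the printed values assume a typical platform with an 8-bit, two's complement char:

#include <cstdio>

int main() {
    // memcmp is only required to return a value that is < 0, == 0 or > 0.
    // Suppose a particular implementation returns 256 for "greater":
    int cmp = 256;                      // hypothetical memcmp result
    short asShort = (short)cmp;         // still 256: works, but only by luck
    char  asChar  = (char)cmp;          // high bits discarded
    std::printf("%d %d\n", asShort, (int)asChar);  // "256 0": "greater" became "equal"

    // Suppose another call returns 160:
    cmp = 160;
    signed char flipped = (signed char)cmp;
    std::printf("%d\n", (int)flipped);  // "-96": "greater" became "less"
    return 0;
}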

The code of ReactOS project (C++):

HRESULT WINAPI CRecycleBin::CompareIDs(....)
{
  ....
  return MAKE_HRESULT(SEVERITY_SUCCESS, 0,
   (unsigned short)memcmp(pidl1->mkid.abID,
                          pidl2->mkid.abID,
                          pidl1->mkid.cb));
}

PVS-Studio warning: V642 Saving the 'memcmp' function result inside the 'unsigned short' type variable is inappropriate. The significant bits could be lost breaking the program's logic. recyclebin.cpp 542

The code of Firebird project (C++):

SSHORT TextType::compare(ULONG len1, const UCHAR* str1,
ULONG len2, const UCHAR* str2)
{
  ....
  SSHORT cmp = memcmp(str1, str2, MIN(len1, len2));

  if (cmp == 0)
    cmp = (len1 < len2 ? -1 : (len1 > len2 ? 1 : 0));
  return cmp;
}

PVS-Studio warning: V642 Saving the 'memcmp' function result inside the 'short' type variable is inappropriate. The significant bits could be lost breaking the program's logic. texttype.cpp 338

The code of CoreCLR project (C++):

bool operator( )(const GUID& _Key1, const GUID& _Key2) const
  { return memcmp(&_Key1, &_Key2, sizeof(GUID)) == -1; }

PVS-Studio warning: V698 Expression 'memcmp(....) == -1' is incorrect. This function can return not only the value '-1', but any negative value. Consider using 'memcmp(....) < 0' instead. sos util.cpp 142

The code of OpenToonz project (C++):

bool TFilePath::operator<(const TFilePath &fp) const
{
  ....
  char differ;
  differ = _wcsicmp(iName.c_str(), jName.c_str());
  if (differ != 0)
    return differ < 0 ? true : false;
  ....
}

PVS-Studio warning: V642 Saving the '_wcsicmp' function result inside the 'char' type variable is inappropriate. The significant bits could be lost, breaking the program's logic. tfilepath.cpp 328

Pattern: Incorrect Check of Null References

This error pattern is typical for C# programs. Sometimes in comparison functions programmers perform the type cast with the help of the as operator. The error is that a programmer inadvertently checks against null not the new reference, but the original one. Let's take a look at a synthetic example:

ChildT foo = obj as ChildT;
if (obj == null)
  return false;
if (foo.zzz()) {}

The check if (obj == null) protects against the situation where the obj variable contains a null reference. However, there is no protection against the case where the as operator returns a null reference. The correct code should be like this:

ChildT foo = obj as ChildT;
if (foo == null)
  return false;
if (foo.zzz()) {}

Typically, this error occurs due to the programmer's negligence. Similar bugs are possible in C and C++ programs, but I haven't found such a case in our error database.

The code of MonoDevelop project (C#):

public override bool Equals (object o)
{
  SolutionItemReference sr = o as SolutionItemReference;
  if (o == null)
    return false;
  return (path == sr.path) && (id == sr.id);
}

PVS-Studio warning: V3019 Possibly an incorrect variable is compared to null after type conversion using 'as' keyword. Check variables 'o', 'sr'. MonoDevelop.Core SolutionItemReference.cs 81

The code of CoreFX (C#):

public override bool Equals(object comparand)
{
  CredentialHostKey comparedCredentialKey =
                                  comparand as CredentialHostKey;

  if (comparand == null)
  {
    // This covers also the compared == null case
    return false;
  }

  bool equals = string.Equals(AuthenticationType,
        comparedCredentialKey.AuthenticationType, ....
  ....
}

PVS-Studio warning: V3019 Possibly an incorrect variable is compared to null after type conversion using 'as' keyword. Check variables 'comparand', 'comparedCredentialKey'. CredentialCache.cs 4007

The code of Roslyn project (C#):

public override bool Equals(object obj)
{
  var d = obj as DiagnosticDescription;

  if (obj == null)
    return false;

  if (!_code.Equals(d._code))
    return false;
  ....
}

PVS-Studio warning: V3019 Possibly an incorrect variable is compared to null after type conversion using 'as' keyword. Check variables 'obj', 'd'. DiagnosticDescription.cs 201

The code of Roslyn (C#):

protected override bool AreEqual(object other)
{
  var otherResourceString = other as LocalizableResourceString;
  return
    other != null &&
    _nameOfLocalizableResource ==
      otherResourceString._nameOfLocalizableResource &&
    _resourceManager == otherResourceString._resourceManager &&
    _resourceSource == otherResourceString._resourceSource &&
    ....
}

PVS-Studio warning: V3019 Possibly an incorrect variable is compared to null after type conversion using 'as' keyword. Check variables 'other', 'otherResourceString'. LocalizableResourceString.cs 121

The code of MSBuild project (C#):

public override bool Equals(object obj)
{
   AssemblyNameExtension name = obj as AssemblyNameExtension;
   if (obj == null)  // <=
   {
     return false;
   }
   ....
}

PVS-Studio warning: V3019 Possibly an incorrect variable is compared to null after type conversion using 'as' keyword. Check variables 'obj', 'name'. AssemblyRemapping.cs 64

The code of Mono project (C#):

public override bool Equals (object o)
{
  UrlMembershipCondition umc = (o as UrlMembershipCondition);
  if (o == null)                                      // <=
    return false;

  ....

  return (String.Compare (u, 0, umc.Url, ....) == 0); // <=
}

PVS-Studio warning: V3019 Possibly an incorrect variable is compared to null after type conversion using 'as' keyword. Check variables 'o', 'umc'. UrlMembershipCondition.cs 111

The code of Media Portal 2 project (C#):

public override bool Equals(object obj)
{
  EpisodeInfo other = obj as EpisodeInfo;
  if (obj == null) return false;
  if (TvdbId > 0 && other.TvdbId > 0)
    return TvdbId == other.TvdbId;
  ....
}

PVS-Studio warning: V3019 Possibly an incorrect variable is compared to null after type conversion using 'as' keyword. Check variables 'obj', 'other'. EpisodeInfo.cs 560

The code of NASA World Wind project (C#):

public int CompareTo(object obj)
{
  RenderableObject robj = obj as RenderableObject;
  if(obj == null)                                 // <=
    return 1;
  return this.m_renderPriority.CompareTo(robj.RenderPriority);
}

PVS-Studio warning: V3019 Possibly an incorrect variable is compared to null after type conversion using 'as' keyword. Check variables 'obj', 'robj'. RenderableObject.cs 199

Pattern: Incorrect Loops

In some functions, collections of items are compared. Of course, different variants of loops are used for the comparison. If a programmer writes the code inattentively, it's easy to mix something up, just as with the comparison functions themselves. Let's look at a few of these situations.

The code of Trans-Proteomic Pipeline (C++):

bool Peptide::operator==(Peptide& p) {
  ....
  for (i = 0, j = 0;
       i < this->stripped.length(), j < p.stripped.length();
       i++, j++) {
  ....
}

PVS-Studio warning: V521 Such expressions using the ',' operator are dangerous. Make sure the expression is correct. tpplib peptide.cpp 191

Note that the comma operator is used in the condition. The code is clearly incorrect, because the condition written to the left of the comma is ignored. That is, the condition on the left is evaluated, but its result is not used in any way.

The code of Qt project (C++):

bool equals( class1* val1, class2* val2 ) const
{
  ...
  size_t size = val1->size();
  ...
  while ( --size >= 0 ){
    if ( !comp(*itr1,*itr2) )
      return false;
    itr1++;
    itr2++;
  }
  ...
}

PVS-Studio warning: V547 Expression '-- size >= 0' is always true. Unsigned type value is always >= 0. QtCLucene arrays.h 154

The code of CLucene project (C++):

class Arrays
{
  ....
   bool equals( class1* val1, class2* val2 ) const{
     static _comparator comp;
     if ( val1 == val2 )
       return true;
     size_t size = val1->size();
     if ( size != val2->size() )
       return false;
     _itr1 itr1 = val1->begin();
     _itr2 itr2 = val2->begin();
     while ( --size >= 0 ){
       if ( !comp(*itr1,*itr2) )
         return false;
       itr1++;
       itr2++;
     }
   return true;
  }
  ....
}

PVS-Studio warning: V547 Expression '-- size >= 0' is always true. Unsigned type value is always >= 0. arrays.h 154

The code of Mono project (C#):

public override bool Equals (object obj)
{
  ....
  for (int i=0; i < list.Count; i++) {
    bool found = false;
    for (int j=0; i < ps.list.Count; j++) {     // <=
      if (list [i].Equals (ps.list [j])) {
        found = true;
        break;
      }
    }
    if (!found)
      return false;
  }
  return true;
}

PVS-Studio warning: V3015 It is likely that a wrong variable is being compared inside the 'for' operator. Consider reviewing 'i' corlib-net_4_x PermissionSet.cs 607

Apparently, there is a typo here, and the variable j instead of i should be used in the nested loop:

for (int j=0; j < ps.list.Count; j++)

Pattern: A = getA(), B = GetA()

Quite often in the comparison functions a programmer has to write code of this kind:

if (GetA().x == GetB().x && GetA().y == GetB().y)

Intermediate variables are used to reduce the size of the conditions or for optimization:

Type A = GetA();
Type B = GetB();
if (A.x == B.x && A.y == B.y)

But inadvertently, a person sometimes makes a mistake and initializes temporary variables with the same value:

Type A = GetA();
Type B = GetA();

Now let's take a look at these errors in the code of real applications.

The code of LibreOffice project (C++):

bool CmpAttr(
  const SfxPoolItem& rItem1, const SfxPoolItem& rItem2)
{
  ....
  bool bNumOffsetEqual = false;
  ::boost::optional<sal_uInt16> oNumOffset1 =
        static_cast<const SwFmtPageDesc&>(rItem1).GetNumOffset();
  ::boost::optional<sal_uInt16> oNumOffset2 =
        static_cast<const SwFmtPageDesc&>(rItem1).GetNumOffset();

  if (!oNumOffset1 && !oNumOffset2)
  {
    bNumOffsetEqual = true;
  }
  else if (oNumOffset1 && oNumOffset2)
  {
    bNumOffsetEqual = oNumOffset1.get() == oNumOffset2.get();
  }
  else
  {
    bNumOffsetEqual = false;
  }
  ....
}

PVS-Studio warning: V656 Variables 'oNumOffset1', 'oNumOffset2' are initialized through the call to the same function. It's probably an error or un-optimized code. Check lines: 68, 69. findattr.cxx 69

The code of Qt project (C++):

AtomicComparator::ComparisonResult
IntegerComparator::compare(const Item &o1,
                           const AtomicComparator::Operator,
                           const Item &o2) const
{
  const Numeric *const num1 = o1.as<Numeric>();
  const Numeric *const num2 = o1.as<Numeric>();

  if(num1->isSigned() || num2->isSigned())
  ....
}

PVS-Studio warning: V656 Variables 'num1', 'num2' are initialized through the call to the same function. It's probably an error or un-optimized code. Consider inspecting the 'o1.as < Numeric > ()' expression. Check lines: 220, 221. qatomiccomparators.cpp 221

Pattern: Sloppy Copying of the Code

A large number of the errors cited previously can be called consequences of sloppy copy-paste. They fell under certain categories of erroneous patterns, and I decided it would be logical to describe them in the corresponding sections. However, I have several errors that clearly appeared because of sloppy code copying but that I have no idea how to classify. That's why I collected them here.

The code of CoreCLR project (C++):

int __cdecl Compiler::RefCntCmp(const void* op1, const void* op2)
{
  ....
  if (weight1)
  {
    ....
    if (varTypeIsGC(dsc1->TypeGet()))
    {
      weight1 += BB_UNITY_WEIGHT / 2;
    }
    if (dsc1->lvRegister)
    {
      weight1 += BB_UNITY_WEIGHT / 2;
    }
  }

  if (weight1)
  {
    ....
    if (varTypeIsGC(dsc2->TypeGet()))
    {
      weight1 += BB_UNITY_WEIGHT / 2;       // <=
    }
    if (dsc2->lvRegister)
    {
      weight2 += BB_UNITY_WEIGHT / 2;
    }
  }
  ....
}

PVS-Studio warning: V778 Two similar code fragments were found. Perhaps, this is a typo and 'weight2' variable should be used instead of 'weight1'. clrjit lclvars.cpp 2702

The function was long, which is why it is shortened for the article. If we examine its code, we'll see that part of the code was copied, but in one fragment the programmer forgot to replace the variable weight1 with weight2.

The code of WPF samples by Microsoft project (C#):

public int Compare(GlyphRun a, GlyphRun b)
{
  ....
  if (aPoint.Y > bPoint.Y)      // <=
  {
    return -1;
  }
  else if (aPoint.Y > bPoint.Y) // <=
  {
    result = 1;
  }
  else if (aPoint.X < bPoint.X)
  {
    result = -1;
  }
  else if (aPoint.X > bPoint.X)
  {
    result = 1;
  }
  ....
}

PVS-Studio warning: V3003 The use of 'if (A) {...} else if (A) {...}' pattern was detected. There is a probability of logical error presence. Check lines: 418, 422. txtserializerwriter.cs 418

The code of PascalABC.NET project (C#):

public void CompareInternal(....)
{
  ....
  else if (left is int64_const)
    CompareInternal(left as int64_const, right as int64_const);
  ....
  else if (left is int64_const)
    CompareInternal(left as int64_const, right as int64_const);
  ....
}

PVS-Studio warning: V3003 The use of 'if (A) {...} else if (A) {...}' pattern was detected. There is a probability of logical error presence. Check lines: 597, 631. ParserTools SyntaxTreeComparer.cs 597

The code of SharpDevelop project (C#):

public int Compare(SharpTreeNode x, SharpTreeNode y)
{
  ....
  if (typeNameComparison == 0) {
    if (x.Text.ToString().Length < y.Text.ToString().Length)
      return -1;
    if (x.Text.ToString().Length < y.Text.ToString().Length)
      return 1;
  }
  ....
}

PVS-Studio warning: V3021 There are two 'if' statements with identical conditional expressions. The first 'if' statement contains method return. This means that the second 'if' statement is senseless NamespaceTreeNode.cs 87

The code of Coin3D (C++):

int
SbProfilingData::operator == (const SbProfilingData & rhs) const
{
  if (this->actionType != rhs.actionType) return FALSE;
  if (this->actionStartTime != rhs.actionStopTime) return FALSE;
  if (this->actionStartTime != rhs.actionStopTime) return FALSE;
  ....
}

PVS-Studio warning: V649 There are two 'if' statements with identical conditional expressions. The first 'if' statement contains function return. This means that the second 'if' statement is senseless. Check lines: 1205, 1206. sbprofilingdata.cpp 1206

The code of Spring (C++):

bool operator < (const aiFloatKey& o) const
  {return mTime < o.mTime;}
bool operator > (const aiFloatKey& o) const
  {return mTime < o.mTime;}

PVS-Studio warning: V524 It is odd that the body of '>' function is fully equivalent to the body of '<' function. assimp 3dshelper.h 470

And here is the last, particularly interesting code fragment that PVS-Studio analyzer found in MySQL project (C++).

static int rr_cmp(uchar *a,uchar *b)
{
  if (a[0] != b[0])
    return (int) a[0] - (int) b[0];
  if (a[1] != b[1])
    return (int) a[1] - (int) b[1];
  if (a[2] != b[2])
    return (int) a[2] - (int) b[2];
  if (a[3] != b[3])
    return (int) a[3] - (int) b[3];
  if (a[4] != b[4])
    return (int) a[4] - (int) b[4];
  if (a[5] != b[5])
    return (int) a[1] - (int) b[5]; // <=
  if (a[6] != b[6])
    return (int) a[6] - (int) b[6];
  return (int) a[7] - (int) b[7];
}

PVS-Studio warning: V525 The code containing the collection of similar blocks. Check items '0', '1', '2', '3', '4', '1', '6' in lines 680, 682, 684, 689, 691, 693, 695. sql records.cc 680

Most likely, a programmer wrote the first comparison, then the second, and got bored. So he copied a block of text to the clipboard:

if (a[1] != b[1])
  return (int) a[1] - (int) b[1];

and pasted it into the program text as many times as needed. Then he changed the indexes, but made a mistake in one place and got an incorrect comparison:

if (a[5] != b[5])
  return (int) a[1] - (int) b[5];

Note. I discuss this error in more detail in my mini-book "The Ultimate Question of Programming, Refactoring, and Everything" (see a chapter "Don't do the compiler's job").

Pattern: Equals Method Incorrectly Processes a Null Reference

In C#, the accepted practice is to implement Equals methods so that they correctly handle the case where a null reference is passed as an argument. Unfortunately, not all methods are implemented according to this rule.

The code of GitExtensions (C#):

public override bool Equals(object obj)
{
  return GetHashCode() == obj.GetHashCode(); // <=
}

PVS-Studio warning: V3115 Passing 'null' to 'Equals(object obj)' method should not result in 'NullReferenceException'. Git.hub Organization.cs 14

The code of PascalABC.NET project (C#):

public override bool Equals(object obj)
{
  var rhs = obj as ServiceReferenceMapFile;
  return FileName == rhs.FileName;
}

PVS-Studio warning: V3115 Passing 'null' to 'Equals' method should not result in 'NullReferenceException'. ICSharpCode.SharpDevelop ServiceReferenceMapFile.cs 31

Miscellaneous Errors

The code of G3D Content Pak project (C++):

bool Matrix4::operator==(const Matrix4& other) const {
  if (memcmp(this, &other, sizeof(Matrix4) == 0)) {
    return true;
  }
  ...
}

PVS-Studio warning: V575 The 'memcmp' function processes '0' elements. Inspect the 'third' argument. graphics3D matrix4.cpp 269

One closing bracket is put in the wrong place. As a result, the number of bytes to compare is evaluated by the expression sizeof(Matrix4) == 0. The size of any class is greater than 0, which means the result of the expression is 0. Thus, 0 bytes get compared.

Correct variant:

if (memcmp(this, &other, sizeof(Matrix4)) == 0) {

The code of Wolfenstein 3D project (C++):

inline int operator!=( quat_t a, quat_t b )
{
  return ( ( a.x != b.x ) || ( a.y != b.y ) ||
           ( a.z != b.z ) && ( a.w != b.w ) );
}

PVS-Studio warning: V648 Priority of the '&&' operation is higher than that of the '||' operation. math_quaternion.h 167

Apparently, in one fragment the && operator was accidentally written instead of ||.

The code of FlightGear project (C):

static int tokMatch(struct Token* a, struct Token* b)
{
  int i, l = a->strlen;
  if(!a || !b) return 0;
  ....
}

PVS-Studio warning: V595 The 'a' pointer was utilized before it was verified against nullptr. Check lines: 478, 479. codegen.c 478

If we pass NULL as the first argument to the function, we'll get a null pointer dereference, although the programmer wanted the function to return 0.

The code of WinMerge project (C++):

int TimeSizeCompare::CompareFiles(int compMethod,
                                  const DIFFITEM &di)
{
  UINT code = DIFFCODE::SAME;
  ...
  if (di.left.size != di.right.size)
  {
    code &= ~DIFFCODE::SAME;
    code = DIFFCODE::DIFF;
  }
  ...
}

PVS-Studio warning: V519 The 'code' variable is assigned values twice successively. Perhaps this is a mistake. Check lines: 79, 80. Merge timesizecompare.cpp 80

The code of ReactOS project (C++):

#define IsEqualGUID(rguid1, rguid2) \
  (!memcmp(&(rguid1), &(rguid2), sizeof(GUID)))

static int ctl2_find_guid(....)
{
  MSFT_GuidEntry *guidentry;
  ...
  if (IsEqualGUID(guidentry, guid)) return offset;
  ...
}

PVS-Studio warning: V512 A call of the 'memcmp' function will lead to underflow of the buffer 'guidentry'. oleaut32 typelib2.c 320

A pointer is passed here as the first argument. As a result, the address of the pointer itself gets compared, which makes no sense.

Correct variant:

if (IsEqualGUID(*guidentry, guid)) return offset;

The code of IronPython and IronRuby project (C#):

public static bool Equals(float x, float y) {
  if (x == y) {
    return !Single.IsNaN(x);
  }
  return x == y;
}

PVS-Studio warning: V3024 An odd precise comparison: x == y. Consider using a comparison with defined precision: Math.Abs(A - B) < Epsilon. FloatOps.cs 1048

It's not clear what the point of the special check against NaN is here. If the condition (x == y) is true, then both x and y are different from NaN, because NaN isn't equal to any value, including itself. It seems that the check against NaN is simply unnecessary, and the code can be shortened to:

public static bool Equals(float x, float y) {
  return x == y;
}

The code of Mono project (C#):

public bool Equals (CounterSample other)
{
  return
    rawValue         == other.rawValue         &&
    baseValue        == other.counterFrequency &&   // <=
    counterFrequency == other.counterFrequency &&   // <=
    systemFrequency  == other.systemFrequency  &&
    timeStamp        == other.timeStamp        &&
    timeStamp100nSec == other.timeStamp100nSec &&
    counterTimeStamp == other.counterTimeStamp &&
    counterType      == other.counterType;
}

PVS-Studio warning: V3112 An abnormality within similar comparisons. It is possible that a typo is present inside the expression 'baseValue == other.counterFrequency'. System-net_4_x CounterSample.cs 139

How Do these Programs Work at all?

Looking through all the errors, it seems miraculous that these programs work at all. Indeed, comparison functions perform a very important and responsible task in a program.

There are several explanations of why these programs work despite these errors:

  1. In many functions, only part of the object is compared incorrectly. A partial comparison is enough for most of the tasks in the program.
  2. There are no situations (yet) in which the function works incorrectly. For example, this applies to the functions that aren't protected from null pointers, or those where the result of the memcmp function call is placed into a variable of type char. The program is simply lucky.
  3. The reviewed comparison function is used very rarely or not used at all.
  4. Who said that the program is working? A lot of programs really do something wrong!

Recommendations

I have demonstrated how many errors can be found in comparison functions. It follows that the correctness of these functions should definitely be checked with unit tests.

It is really necessary to write unit tests for comparison operators, Equals functions, and so on.

I am quite sure that before reading this article many programmers believed that unit tests for such functions are extra work that won't detect any errors anyway: comparison functions just look so simple at first glance... Well, now I have shown the horrors that can hide in them.
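To make the idea concrete, here is a minimal sketch of such a test in C++. The Point type, its operator== and the compare() function are hypothetical, introduced only for illustration; they are not taken from any of the projects above.

#include <cassert>

// Hypothetical type with the comparison functions under test.
struct Point { int x, y; };

bool operator==(const Point &a, const Point &b) {
  return a.x == b.x && a.y == b.y;   // every field takes part in the comparison
}

int compare(const Point &a, const Point &b) {
  if (a.x != b.x) return a.x < b.x ? -1 : 1;
  if (a.y != b.y) return a.y < b.y ? -1 : 1;
  return 0;
}

int main() {
  const Point p{1, 2}, q{1, 3}, r{2, 2};
  // Reflexivity and symmetry of operator==.
  assert(p == p);
  assert(!(p == q) && !(q == p));
  // compare() must be antisymmetric and consistent with operator==.
  assert(compare(p, p) == 0);
  assert(compare(p, q) == -compare(q, p));
  assert(compare(p, r) == -compare(r, p));
  return 0;
}

Tests of this kind would have caught every copy-paste error shown in this section, because they exercise fields and argument orders that a "looks obviously correct" review tends to skip.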

Code reviews and using static analysis tools would also be a great help.

Conclusion

In this article we mentioned a large number of big-name projects that are developed by highly qualified experts. These projects are thoroughly tested using different methodologies. Still, that didn't stop PVS-Studio from finding errors in them. This shows that PVS-Studio can become a nice complement to the other methodologies used to improve the quality and reliability of code.

Accelerating Deep Learning Inference with Intel® Processor Graphics

Introduction

This paper introduces the tools recently made available to accelerate AI inference in edge devices on Intel® Processor Graphics solutions across the spectrum of Intel SOCs. In particular, the paper covers Intel’s Deep Learning Deployment Toolkit and how it helps to increase the performance and, perhaps even more importantly, the performance per watt of AI inference in your product. The paper also introduces the underlying Compute Library for Deep Neural Networks (clDNN), a set of neural network kernel optimizations written in OpenCL and available as open source.

Target audience: Software developers, platform architects, and academics seeking to maximize deep learning performance on Intel® Processor Graphics.

Note: Artificial Intelligence (AI), Machine Learning (ML), and Deep Learning (DL) are used interchangeably in this paper. The larger field is artificial intelligence. This article focuses on the machine learning piece of AI, or more specifically on the multi-layered neural network form of machine learning called deep learning.

Background on AI and the Move to the Edge

Artificial Intelligence, or AI, has been a domain of research with fits and starts over the last 60 years. AI has really taken off in the last 5 years with the availability of large data sources, growth in compute engines, and modern algorithm development based on neural networks. Machine learning, and the many layers of deep learning, are propelling AI into all parts of modern life as it is applied to varied usages, from computer vision, identification, and classification to natural language processing and forecasting. These base-level tasks support decision making in many areas of life.

As data scientist Andrew Ng noted, AI is the next electricity: “Just as electricity transformed almost everything 100 years ago, today I actually have a hard time thinking of an industry that I don’t think AI will transform in the next several years.”

This wave of AI work began in the cloud, running on servers. While AI usage in the cloud continues to grow quickly, there is a trend to perform AI inference on the edge. This trend toward devices performing machine learning locally, versus relying solely on the cloud, is driven by the need to lower latency, ensure persistent availability, lower costs, and address privacy concerns. We are moving to the day when devices from phones and PCs to cars, robots, and drones to embedded devices like refrigerators and washing machines will all have AI embedded in them. As Andrew Ng pointed out, companies in all industries are figuring out their AI strategy. Additionally, the field of AI is rapidly changing, with novel topologies being introduced on a weekly basis. This requires product developers to design for flexibility so that the AI software in their products can be modified frequently.

Intel® Processor Graphics as a Solution for AI Inference on the Edge

Intel Processor Graphics (Intel® HD Graphics, Intel® Iris® Graphics and Intel® Iris® Pro Graphics) provides a good balance of fixed function acceleration with programmability to deliver good performance/power across the emerging AI workloads with the flexibility to allow customers to adopt the latest AI topologies. Specifically, Intel® Processor Graphics provides the characteristics of:

Ubiquity– Intel Processor Graphics, as part of Intel’s SOCs, has already shipped in over a billion devices ranging from servers to PCs to embedded devices. This makes it a widely available engine to run machine learning algorithms.

Scalability– As AI becomes embedded in every product, the design points of power and performance will vary greatly.  Intel Processor Graphics is available in a broad set of power/performance offerings from Intel® Atom™ processors, Intel® Core™ processors, and Intel® Xeon™ processors.

Leadership in Media– More than 70% of internet traffic is video. One of the top usages for AI in devices will be computer vision. Along with compute for AI, encoding, decoding, and processing video will be employed concurrently. Intel® Quick Sync Video technology is based on the dedicated media capabilities of Intel® Processor Graphics to improve the performance and power efficiency of media applications, specifically speeding up functions like decode, encode, and video processing. See Intel’s Quick Sync Video page to learn more. This is paired with the Intel® Media SDK and Intel® Media Server Studio, APIs that provide access to hardware-accelerated codecs on Windows* and Linux*.

Powerful and Flexible Instruction Set Architecture (ISA) - The Instruction Set Architecture (ISA) of the Processor Graphics SIMD execution units is well suited to deep learning. This ISA offers rich data type support for 32-bit FP, 16-bit FP, 32-bit integer, and 16-bit integer, with SIMD multiply-accumulate instructions. At theoretical peak, these operations can complete on every clock for every execution unit. Additionally, the ISA offers rich sub-register region addressing to enable efficient cross-lane sharing for optimized convolution implementations, or efficient horizontal scan-reduce operations. Finally, the ISA provides efficient memory block loads to quickly load data tiles for optimized convolution or optimized generalized matrix multiply implementations.

Memory architecture– When using discrete graphics acceleration for deep learning, input and output data have to be transferred from system memory to discrete graphics memory on every execution – this has the double cost of increased latency and power. Intel® Processor Graphics is integrated on-die with the CPU. This integration enables the CPU and Processor Graphics to share system memory, the memory controller, and portions of the cache hierarchy. Such a shared memory architecture enables efficient input/output data transfer and even “zero copy” buffer sharing. Additionally, Intel has SKU offerings with additional package-integrated eDRAM.

Intel’s Deep Learning Deployment Toolkit

To utilize the hardware resources of Intel® Processor Graphics easily and effectively, Intel provides the Deep Learning Deployment Toolkit. This toolkit takes a trained model and tailors it to run optimally for specific endpoint device characteristics. In addition, it delivers a unified API to integrate inference with application logic.

The Deep Learning Deployment Toolkit comprises two main components: the Model Optimizer and the Inference Engine (Figure 1).  

Figure 1: Model flow through the Deep Learning Deployment Toolkit


Model Optimizer is a cross-platform command line tool that performs static model analysis and adjusts deep learning models for optimal execution on end-point target devices. In detail, the Model Optimizer:

  • Takes as input a trained network in a framework specific format (for example from the Caffe* framework)
  • Performs horizontal and vertical fusion of the network layers
  • Prunes unused branches in the network
  • Quantizes weights
  • Produces as output an Internal Representation (IR) of the network - a pair of files that describe the whole model:
    • Topology file - an XML file that describes the network topology
    • Trained data file - a .bin file that contains the weights and biases binary data

The produced IR is used as an input for the Inference Engine.

Inference Engine is a runtime that delivers a unified API to integrate the inference with application logic. Specifically it:

  • Takes as input an IR produced by the Model Optimizer
  • Optimizes inference execution for target hardware
  • Delivers inference solution with reduced footprint on embedded inference platforms.

The Deep Learning Deployment Toolkit can optimize inference for running on different hardware units, such as the CPU and GPU, and will support FPGAs in the future. For acceleration on the CPU it uses the MKL-DNN plugin – the deep neural network domain of the Intel® Math Kernel Library, which includes the functions necessary to accelerate the most popular image recognition topologies. FPGA support is planned through a plugin for the Intel® Deep Learning Inference Accelerator. For the GPU, the Deep Learning Deployment Toolkit uses clDNN – a library of OpenCL kernels. The next section explains how clDNN helps to improve inference performance.

Compute Library for Deep Neural Networks (clDNN)

clDNN is a library of kernels to accelerate deep learning on Intel® Processor Graphics. Based on OpenCL, these kernels accelerate many of the common function calls in the popular topologies (AlexNet*, VGG*, GoogleNet*, ResNet*, Faster-RCNN*, SqueezeNet* and FCN* are supported today with more being added). To give developers the greatest flexibility and highest achievable performance Intel is delivering:

1) The full library as open source, so developers and customers can use existing kernels as models to build upon, or create their own hardware-specific kernels for running deep learning.

2) Compute extensions to expose the full hardware capabilities to developers.

During network compilation, clDNN breaks the workflow optimizations into three stages, described below.

Figure 2: Model flow from topology creation to execution


Network Compilation and the 3 Stages of clDNN

Stage 1:  Network Level

Fusing is one of the most efficient ways to optimize graphs in DL. In clDNN, we have created two ways to perform fusing – one more automated, for running on a single accelerator (naive inference client), and a second for a more experienced data scientist who tunes work across multiple accelerators (set of fused primitives). In more detail:

  • Naive inference client – you have a workload and want it to run on one accelerator. In this case the user can ask clDNN to perform fusing during network compilation.
  • Set of fused primitives – in this approach, the user, who is experienced in tuning models, does the graph compilation with pattern matching in the application to balance the work across various accelerators. For this approach we expose already-fused primitives.

Currently clDNN supports three fusions: convolution with activation, fully connected with activation, and deconvolution with activation. Additional fusions are in development.
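To see why this kind of fusion pays off, consider the sketch below. It is not clDNN code – clDNN's fused primitives are OpenCL GPU kernels – but a generic C++ fragment showing the memory-traffic argument: with the activation fused into the convolution loop, each output is written once, instead of being written by the convolution pass and re-read by a separate ReLU pass.

#include <algorithm>
#include <vector>

// Fused 1D convolution + ReLU: the activation is applied while the output
// value is still in a register, so there is no intermediate buffer to write
// and re-read. The caller provides out with in.size() - w.size() + 1 elements.
void conv1d_relu_fused(const std::vector<float> &in,
                       const std::vector<float> &w,
                       std::vector<float> &out) {
  const int K = static_cast<int>(w.size());
  for (size_t o = 0; o + K <= in.size(); ++o) {
    float acc = 0.0f;
    for (int k = 0; k < K; ++k)
      acc += in[o + k] * w[k];          // convolution
    out[o] = std::max(acc, 0.0f);       // ReLU fused in: one store, no extra pass
  }
}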

Another part of the network level optimizations is the padding implementation. Choosing OpenCL buffers as data storage requires implementing padding either by adding conditions inside the kernels or by providing a buffer with a frame around the input data. The first approach would eat into the register budget, which would constrain the registers available for the convolution kernels, negatively impacting performance.

Experiments have shown that adding a properly aligned frame around the buffers provides better performance when it is done as follows:

Consider a network with two primitives, A and B, where B requires padding equal to 2:

Figure 3: Padding Example


This requires adding a frame with size 2x2:


To add the frame we need to add the reorder primitive:


and fuse this with the A primitive:

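Conceptually, the reorder step just copies the activation into a larger buffer with a frame (here filled with zeros) around it, so the convolution kernel can read out-of-bounds neighbors without boundary checks. The fragment below is a generic C++ sketch of that copy, not the clDNN implementation:

#include <vector>

// Copy a W x H tensor into a (W + 2*pad) x (H + 2*pad) buffer with a zero frame.
std::vector<float> add_frame(const std::vector<float> &src, int W, int H, int pad) {
  const int Wp = W + 2 * pad;
  const int Hp = H + 2 * pad;
  std::vector<float> dst(static_cast<size_t>(Wp) * Hp, 0.0f);  // frame stays zero
  for (int y = 0; y < H; ++y)
    for (int x = 0; x < W; ++x)
      dst[(y + pad) * Wp + (x + pad)] = src[y * W + x];
  return dst;
}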

Stage 2: Memory Level

As soon as the topology is defined and data is provided, the network is ready to compile. The first step of network compilation is determining the activation layout. In DNNs, data stored in hidden layers is described as 4D memory chunks. In clDNN, the layout is described with four letters (a sketch of how they map to linear offsets follows the list below):

  • B - number of patches in batch
  • F - number of feature maps or channels
  • X - spatial or width
  • Y - spatial or height
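As a rough illustration of what these letters mean for addressing (illustrative only, not taken from the clDNN sources), a BFYX chunk lays the X dimension out innermost, while YXFB lays the batch out innermost:

// Linear offsets for the two layouts, given sizes B, F, Y, X and indices b, f, y, x.
inline size_t offset_bfyx(size_t b, size_t f, size_t y, size_t x,
                          size_t F, size_t Y, size_t X) {
  return ((b * F + f) * Y + y) * X + x;   // x is contiguous in memory
}

inline size_t offset_yxfb(size_t b, size_t f, size_t y, size_t x,
                          size_t B, size_t F, size_t X) {
  return ((y * X + x) * F + f) * B + b;   // b is contiguous in memory
}

The choice between the two decides which dimension neighboring work items touch contiguously, which is why the optimal layout depends on data type, batch size, and the convolution parameters described below.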

Figure 4: Example of a memory chunk


Figure 5: For most cases the most optimal layout is BFYX


If the data type is half precision (fp16), the batch size is greater than or equal to 32, and the convolutions use the split parameter (depth split as in AlexNet* convolutions), then the clDNN layout is YXFB.

Figure 6: YXFB layout


During memory level optimization, after kernels for every primitive have been chosen, clDNN runs weight optimizations, which transform the user-provided weights into a form suitable for the chosen kernel. Weights for convolutions are stored in the layout shown in Figure 7:

Figure 7: Weights for convolutions in IS_IYX_OSV16


 

For fully connected primitives, depending on the data type (fp16/fp32), weights can be transformed into one of the following layouts:

Figure 8: memory layouts for optimized fully connected primitives


Stage 3: Kernel Level

To enable modern topologies in an efficient way on Intel® Processor Graphics, a focus on the convolution implementation is needed. To do this, clDNN uses output blocks that enable each thread on the Intel® Processor Graphics to compute more than one output at a time. The size of the block depends on the convolution stride size. If the block size is greater than the stride, then clDNN uses shuffle technology to reuse weights and inputs within the neighborhood. This approach yields 85% of peak performance on AlexNet* convolution kernels. All reads and writes use the more optimal block_read/block_write functions. A similar approach is applied to achieve high efficiency running the deconvolution and pooling primitives.
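The output-block idea can be sketched in plain C++ (conceptual only, not the actual OpenCL kernels): each work item produces a small block of neighboring outputs, so every weight it loads is reused several times instead of once.

// Conceptual 1D convolution where each "thread" computes BLOCK adjacent outputs.
// The caller provides an input of at least out_size + K - 1 elements.
constexpr int BLOCK = 4;

void conv1d_blocked(const float *in, const float *w, float *out,
                    int out_size, int K) {
  for (int o = 0; o < out_size; o += BLOCK) {      // one iteration ~ one work item
    float acc[BLOCK] = {};
    for (int k = 0; k < K; ++k) {
      const float wk = w[k];                       // loaded once, used BLOCK times
      for (int b = 0; b < BLOCK && o + b < out_size; ++b)
        acc[b] += in[o + b + k] * wk;
    }
    for (int b = 0; b < BLOCK && o + b < out_size; ++b)
      out[o + b] = acc[b];
  }
}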

Performance Numbers:


The Intel® Iris® Pro Graphics provides more peak performance and the Intel® HD Graphics provides more performance/watt.

Details:

Batch1 FP16

Intel® HD Graphics 530 (blue) configuration: Intel® Core™ i5-6500 CPU @ 3.20GHz, Intel® HD Graphics 530, fixed frequency - 1000 MHz, CentOS 7.2 kernel 4.2, OpenCL driver: Intel SRB 4.1., Memory: 2x8GB DDR4 2133

Intel® Iris® Pro Graphics 580 (orange) configuration: Intel® Core™ i7-6770HQ CPU @ 2.60GHz, Intel® Iris® Pro Graphics 580, fixed frequency – 950 MHz, CentOS 7.2 kernel 4.2, OpenCL driver: Intel SRB 4.1., Memory: 2x4GB DDR4 2133

Topologies: AlexNet*, VGG16-FACE*

Memory Bandwidth vs Compute

In topologies with memory-bound sequences (like AlexNet*), we can increase the batch size, reusing weights across the batch to gain higher images/second performance. But for topologies that are compute bound (like VGG16-FACE*), even with a single image on the input, we see little benefit from larger batch sizes:


The systems used for these measurements are configured in the same way as in the previous pair of benchmarks.

Power efficiency

In some power-constrained workloads, it can be more important to maximize performance/watt than absolute performance. Since dynamic power scales roughly linearly with clock frequency but quadratically with voltage (P ≈ C·V²·f), lowering the frequency (and the voltage along with it) reduces power faster than it reduces performance, so GPU performance per watt improves as the frequency is lowered. Intel® HD Graphics can show a better FPS/watt ratio running at lower frequency in lower power states. Different Intel processor products also offer different leakage and power behavior. For example, the 6th and 7th generation Intel “Y SKUs,” such as the Intel® Core™ m7-6Y75 processor with Intel® HD Graphics 515, provide lower peak performance but more performance/watt. Through the combination of selecting the right Intel SOC across a wide range of power and performance points and choosing the appropriate frequency, the developer can tune for a broad range of workloads and power envelopes.

Conclusion:

AI is becoming pervasive, driven by the huge advancements in machine learning, and particularly deep learning, over the last few years. All devices on the edge are moving toward implementing some form of AI, increasingly performed locally due to cost, latency, and privacy concerns. Intel® Processor Graphics provides a good solution to accelerate deep learning workloads. This paper described the Deep Learning Model Optimizer, the Inference Engine, and the clDNN library of optimized CNN kernels that are available to help developers deliver AI-enabled products to market. For more information or to get started, download the tools or libraries from the links below:

Appendix A: List of Primitives in the clDNN Library

Compute Library for Deep Neural Networks (clDNN) is middleware for accelerating DNN inference on Intel® HD Graphics and Iris® Pro Graphics. This project includes CNN primitive implementations for Intel GPUs with C and C++ interfaces.

The clDNN library implements the following set of primitives:

  • Compute Primitives
    • Convolution
    • Deconvolution
    • Fully connected (inner product)
    • Element-Wise
  • Pooling
    • average
    • maximum
    • ROI pooling
  • Normalization
    • LRN across/within channel
    • Normalize
    • Batch-Normalization
  • Activation
    • rectified linear unit (RelU)
  • Auxiliary
    • Crop
    • Concatenation
    • Simpler NMS
    • Prior box
    • Detection output
    • Reorder
  • Softmax

With this primitive set, users can build and execute the most common image recognition, semantic segmentation, and object detection network topologies, such as:

  • AlexNet*
  • GoogleNet*
  • ResNet*
  • VGG16-FACE*
  • Faster-RCNN*
  • FCN*

Nightdive turns games of the past into a bright future…virtually


Many game companies open up an office space, get a development team together to work in that office, grind away for a couple of years to create a new intellectual property (IP), then put the product up for sale through retail outlets and digital-distribution sites, such as Steam. Hopefully, profit follows, so they can do it all over again.

Nightdive Studios, on the other hand, took a drastically different path, and its website reveals that core mission: “Bringing lost and forgotten gaming treasures back from the depths…”

By acquiring the rights to already-released games, updating them to work on contemporary platforms, and offering the revamped games through direct-distribution outlets, Nightdive can avoid having to lease office space, and it doesn’t need to employ dozens of local employees to facilitate the work. The development company operates a virtual office environment, which means the people involved in updating and coding the games don’t need to move from their respective countries, or even their homes. All of that contributes to Nightdive’s profits, which the studio uses to, indeed, do it all over again…and again…and again.

A shocking trip

Nightdive was founded in late 2012 by Stephen Kick, now Nightdive’s CEO. Back then, Kick was a character artist with Sony Online Entertainment, but was getting a little tired of making games for others. He decided to embark on a world trip to find new inspiration, and, like many travelers, he brought some games with him — in this case, some classics from his youth.


Stephen Kick. Image Credit: Nightdive Studios.

"One night, I was playing — or attempting to play — System Shock 2, and I couldn’t get the game running,” Kick explains. “I went online, attempted to purchase the game (on GOG.com), and I discovered there was no legal way to commercially buy the product. So, I did some digging, and discovered that the IP had been transferred to an insurance company after Looking Glass Studios had gone out of business. I approached [the insurance company] about digitally re-releasing the game on GOG, Steam, and other digital platforms, and that was pretty much the birth of Nightdive Studios."

Kick says the success with the System Shock 2 re-release was the first step for the newborn company, but it quickly led to “finding other games that were lost to time,” and following the same procedure to bring them back to market. As the classic song goes, “Everything old is new again,” and Nightdive is proving that to be quite true with its retro games. The studio has over 100 products in its catalog — available on Steam, GOG, and Humble Bundle’s Humble Store — including The 7th Guest, Shadow Man, Space Rogue, and the Wizardry series.

"Our inspiration really lies in all the games that we grew up with and that we remember fondly," Kick says, "And our desire to replay those games, preserve them for future generations to enjoy, and just to continue, I guess, the stewardship of making sure these games are available for everybody to play again."

Out of the fog

In March 2017, Nightdive brought out its latest release: Turok 2: Seeds of Evil. This first-person shooter debuted in 1998 on the Nintendo 64 console, courtesy of Acclaim Entertainment, and was ported to Windows a year later. Nightdive has already released its Turok 2 update on PC, and is also working on a port to the Xbox One console.


Split-screen multiplayer action in Turok 2. Image Credit: Nightdive Studios

One of Nightdive's goals was for Turok 2 to be playable on almost any PC, so that players on a wide variety of systems can still enjoy a stable game with high visual fidelity.

"It’s interesting…we worked in cooperation with Intel, using their toolsets; Intel provides a variety of different software tools to optimize your game performance," says Larry Kuperman, Nightdive’s director of business development. "One of the things we found with the Intel set, we were able to make sure that [Turok 2] would play on the widest spectrum of computers available, so that if you wanted to fire up Turok on your laptop on the way home, it would play smoothly."

Another change Kuperman points out has to do with the game’s viewing distance. Because of the constraints of the processors in the late ’90s, the original game-developers used fog to limit the distance the player could see ahead, which enabled them to provide highly detailed graphics at a relatively short distance. However, nearly 20 years on, with the increase of CPU power and video cards, distance-limiting fog wasn’t needed.


Larry Kuperman. Image Credit: Nightdive Studios

"We were able to roll back the fog, and give the game a whole new visual treatment,” Kuperman explains. “These are not games that are intended to compete with the highest-end, highest-requirement games out there, but, visually, they’re certainly appealing."

Another Nightdive development team is working on a reboot of System Shock. Nightdive has managed to acquire full rights to the game, so the studio is rebuilding it from the ground up using the Unreal Engine.

"The ultimate goal for us acquiring the license,” Kick says, “is to be able to reintroduce the franchise to the current generation of gamers. That really kicked off around the end of June [2016], when we launched our Kickstarter. We were able to raise 150% of our goal for a total of $1.35 million in order to faithfully reboot the first game in the series."

Their virtual reality

Nightdive’s virtual office environment means that the studio has people all around the world working on projects. As Kick explains, this means that development happens on pretty much a 24/7 basis, with tools (such as GitHub, JIRA, and Slack) enabling collaboration and communication across the team. Software enables managers to track each person’s contribution to make sure everyone is generating what they need to for the project. Kick bemoans some of the tradeoffs — such as the lack of in-office socializing and camaraderie — but Kuperman counters that the distributed office means there are no complaints that a co-worker cracks his knuckles or plays her music too loudly.

Kuperman feels that this is a great time to be in game development, with changes to the creation process enabling end-to-end benefits. With crowdfunding platforms, such as Kickstarter and Fig, it’s easier for a studio to work on a project without needing to make a deal (and share future revenue) with a publisher. Game engines, such as Unity and Unreal, are incredibly powerful, but also free to use until you start selling the product you’ve created. And there are a bunch of digital-sales platforms on which to retail a product, so a developer can self-publish quite easily. Even if the developer opts to work with a publisher to bring a product to market, Kuperman says there are still benefits from those tools.

"A developer can be relatively self-sufficient and come to the publisher, saying 'Look at what I’ve produced so far. Is this something that you’d be interested in?' So you have all those things out there — you have a very robust ecosystem for games development now."

Tall-and-Skinny and Short-and-Wide Optimizations for QR and LQ Decompositions

Intel® Math Kernel Library (Intel® MKL) 2017 update 3 and later versions provide optimized functionality for calculating QR decompositions of tall-and-skinny (TS) matrices, and for calculating LQ decompositions of short-and-wide (SW) matrices.

New routines have been added to Intel MKL to allow for the calculations of QR and LQ factorizations using the TS/SW modifications described above for appropriate matrix sizes. These routines are generalized for all sizes (i.e. they will also work on matrices that are not TS/SW, as they include paths to return to the generic routines when the matrix size is not sufficiently TS/SW). Details of the new routines and parameter specifications can be found in the Intel MKL Developer Reference (https://software.intel.com/en-us/articles/intel-math-kernel-library-documentation). The routines to reference are listed below:

QR Decomposition

  • New TS/SW routines: ?geqr, ?gemqr
  • Generic routines: ?geqrf, ?ormqr (real), ?unmqr (complex)

LQ Decomposition

  • New TS/SW routines: ?gelq, ?gemlq
  • Generic routines: ?gelqf, ?ormlq (real), ?unmlq (complex)


A general overview of the TSQR algorithm is provided in the attached TSKB_QRLQ.pdf file. In addition, this PDF provides example code that calls the QR decomposition of a matrix using the new TSQR routines.
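For readers without the attachment, the fragment below is a minimal sketch of how the new routines might be called through the LAPACKE C interface shipped with Intel MKL (LAPACKE_dgeqr and LAPACKE_dgemqr); the exact prototypes and the T-array sizing convention should be checked against the Intel MKL Developer Reference.

#include <vector>
#include <mkl_lapacke.h>   // LAPACKE C interface shipped with Intel MKL (assumption)

// QR factorization of a tall-and-skinny m-by-n matrix A, stored column-major.
// Error handling is omitted for brevity.
void ts_qr(std::vector<double> &A, lapack_int m, lapack_int n) {
  // Workspace query: with tsize = -1 the required size of T is returned in t[0].
  double t_query[5];
  LAPACKE_dgeqr(LAPACK_COL_MAJOR, m, n, A.data(), m, t_query, -1);

  std::vector<double> T(static_cast<size_t>(t_query[0]));
  LAPACKE_dgeqr(LAPACK_COL_MAJOR, m, n, A.data(), m,
                T.data(), static_cast<lapack_int>(T.size()));

  // A now holds R in its upper triangle; A and T together encode Q.
  // Q (or Q^T) can be applied to another matrix with LAPACKE_dgemqr.
}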

The following charts show the speedup of DGEQR compared to DGEQRF. Performance results of ?GELQ compared to ?GELQF show a similar speedup and thus are not displayed here.

The first chart shows these speedups on an Intel® Xeon® CPU E5-2699 v4 processor, and the second on an Intel® Xeon Phi™ 7250 processor.

 


How F5 Networks Profiles for Success

When Seattle-based F5 Networks, Inc. needed to amp up its BIG-IP DNS* solution for developers, it got help from Intel.

Business users expect their applications to be fast, secure, and always available. Anything less is unacceptable. That’s why F5 gives the developers who build those applications the tools they need to deliver maximum speed, security, and availability.

The company’s BIG-IP DNS improves the performance and availability of applications by sending users to the closest or best-performing physical, virtual, or cloud environment. It also hyperscales and secures developers’ domain name service (DNS) infrastructure from distributed denial of service (DDoS) attacks and delivers a real-time domain name system security extensions (DNSSEC) solution that protects against hijacking.

“Intel® VTune™ Amplifier helped us identify potential performance bottlenecks in the design and engineering of our high-performance networking systems,” explained James Hendergart, strategic initiatives director for F5 Networks. “We worked with the Intel VTune Amplifier team for about a month. They were very responsive to our needs, adding the capability to run Intel VTune Amplifier remotely and in headless environments. It was a great collaboration between Intel and F5.”

Get the whole story in our new case study.

Tutorial: Unlock Intel® GPU capabilities with Intel OpenCL™ Extensions

Download tutorial code here.

Based on an IWOCL 2017 tutorial Unlock Intel GPUs for High Performance Compute, Media and Computer Vision.  

 

Introduction

Intel provides many extensions to the Khronos OpenCL(tm) standard to help you utilize the full range of hardware capabilities.  

  • Subgroups
  • Video Motion Estimation (VME)
  • VEBox

These extensions are not standalone.  They build upon each other.


The tutorial code focuses on subgroups, VME, and VEBox. Image processing and sharing extensions are also used in the tutorial code as solution components.

For more information on Intel extensions: https://software.intel.com/en-us/articles/opencl-intel-graphics-extensions
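Before relying on any of these extensions, it is common to check that the target device actually reports them. The fragment below is a minimal sketch using only the standard OpenCL host API (error handling omitted); it checks for cl_intel_subgroups, and the same pattern applies to the other extension name strings Intel publishes.

#include <CL/cl.h>
#include <iostream>
#include <string>
#include <vector>

int main() {
  cl_platform_id platform;
  clGetPlatformIDs(1, &platform, nullptr);

  cl_device_id device;
  clGetDeviceIDs(platform, CL_DEVICE_TYPE_GPU, 1, &device, nullptr);

  // Query the extension string reported by the device.
  size_t size = 0;
  clGetDeviceInfo(device, CL_DEVICE_EXTENSIONS, 0, nullptr, &size);
  std::vector<char> buf(size);
  clGetDeviceInfo(device, CL_DEVICE_EXTENSIONS, size, buf.data(), nullptr);

  std::string extensions(buf.begin(), buf.end());
  std::cout << "cl_intel_subgroups supported: "
            << (extensions.find("cl_intel_subgroups") != std::string::npos)
            << std::endl;
  return 0;
}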

 

Subgroups


Intel subgroups are 

  • subset of a work group
  • equal to the SIMD width (8,16,or 32)
  • in the same hardware thread of the EU
  • share thread resources (including register space)
  • execute together 

Intel subgroup functions add

  • barrier, broadcast, reduce, scan 
  • shuffle
  • block read/write

More info: Spec

 

Video Motion Estimation (VME)

Intel Gen GPUs accelerate the search for motion in video.  This is a core codec component but can also be used in a wide range of applications from custom bitrate control to computer vision.


 

VEBox

Intel GPUs contain a specialized IP block designed for video enhancement operations.


 

For more info:

 

OpenCL and the OpenCL logo are trademarks of Apple Inc. used by permission by Khronos

 

2017 Intel® Level Up Contest Closed


Thank you for your interest in the 2017 Intel® Level Up Game Dev Contest. The contest closed on May 9, 2017.

 

Modern VR is ‘like the dog who catches a car but doesn’t know what to do with it’

Kim Pallister, VR expert at Intel, and sci-fi author Austin Grossman at GamesBeat Summit 2017

The modern virtual reality market is new, but the idea of virtual worlds has existed in fiction for decades.

Austin Grossman is a sci-fi author who has written books such as the techno-thriller You: A Novel. He also helped write games like the PC classics Deus Ex and System Shock, as well as the more recent Dishonored series. At the GamesBeat Summit today in Berkeley, California, Grossman discussed in an on-stage interview with Kim Pallister, VR expert at Intel, how sci-fi stories from our past can tell us about the future of virtual reality — and how we’re struggling to deal with it.

Grossman brought up novels like Snow Crash and Ready Player One, which both featured VR social spaces. These ideas used to be science fiction, but modern virtual reality devices and online games are making them closer to reality. But Grossman says that we’re like the dog who catches a car but doesn’t know what to do with it.

In novels, Grossman said, VR isn’t just an entertainment experience that people use for 10 minutes at a time. It is an integral part of society that people use for work as much as play. It’s also a tool used to escape from dystopian nightmares. In Ready Player One, many people are living in ghettos of skyscrapers made of trailer-park homes. Its protagonist spends as much time in virtual reality as possible, using it to have access to things he doesn’t have in the real world: friends, education, and adventures.

This could present a danger to our VR future. What if people use the coming virtual worlds to escape the real one? Could we potentially forsake the planet and our ties to it in favor of a more palpable digital illusion?

So, the future of VR presented in fiction could be an unsettling one. But fiction hasn’t gotten everything right. In The Matrix, people need to be in pods or other constrictive devices to be connected to virtual worlds (and that’s besides the fact that most humans were imprisoned and having their energy sucked out by evil robots). But we aren’t using neural interfaces.

“It’s a wonderful thing that we got wrong,” Grossman said. Actual VR has players moving around. He says that this makes VR more exciting and less of a terrifying dystopia.

Grossman noted that world-building is the key skill needed for making enjoyable VR experiences. Making a world for a novel takes him two or three years of planning. But for modern virtual reality games, more work goes into designing and programming the experience. Less attention is given to narrative, characters, and history. These are the things that make people fall in love with a fictional world and want to live in it.

Licensing IP is kind of a cheat, Grossman says. It gives you an immediate world that audiences love. VR designers need to make new worlds of their own. The recent Star Trek: Bridge Crew is a good example of this. Beyond the gameplay, people enjoy the game just because it lets them be in Star Trek.

Virtual reality has the potential to change people and how they relate to each other, forcing us to interact with others in unique ways. But Grossman noted that he also looks forward to having VR teach him. He anticipates full-body tracking, since a VR program could then teach him how to dance. That certainly sounds more pleasant than having machine overlords plug us into a placating VR world while they suck energy from our imprisoned bodies.


Artificial Intelligence Powers Clinical Trials


Clinical Trials

Clinical trials (CT) enable us to understand, diagnose, prevent, and treat diseases. Clinical research has led to making diabetes manageable, and prolonged the lives of AIDS and cancer patients. CT is a fundamental tool of modern medicine; it is the cornerstone of the drug development process.


Figure 1: Clinical trial process

The essence of conducting clinical trials is to evaluate the efficacy of new treatments and therapies. On average, it takes about 10 to 15 years to bring a medication from initial discovery to the hands of patients, and the process can cost billions of dollars. Artificial intelligence (AI) can reduce both the time and the cost by more than half. Applying AI in these procedures allows organizations to create drugs with more precision than manual methodologies. Experiments and analyses that would take human researchers weeks or months can be conducted by AI within minutes.

A Historical Perspective

For centuries, medical studies and clinical approaches have been a combination of religious beliefs, magical perspectives, medicinal herbs, and some science.

Eighteenth Century

Back in 1796, smallpox was killing people in the thousands and tormenting those still alive with constant fear. Everyone from the rich to the poor was affected. While smallpox was killing people, Edward Jenner was busy making observations that milkmaids who had contracted cowpox, possibly from the infected udders of cows, seemed to be immune to smallpox.

After collecting similar data points over several years, he performed a clinical trial. He scraped a pustule from a milkmaid, Sarah Nelmes, who had cowpox, and inserted the matter into a cut on the arm of his gardener’s son, a young boy called James Phipps. Six weeks later, he inoculated the boy with the smallpox virus. Jenner concluded that his hypothesis was correct when James Phipps did not get smallpox.

The word vaccination comes from vacca, Latin for cow. Jenner’s trials led to the discovery of vaccination. Jenner later conducted a mass vaccination, which prevented smallpox. This simple trial conquered epidemics from typhoid, to polio, to measles in later years. A clinical trial had saved thousands of lives!

Nineteenth Century

Lightner Witmer developed practical work in clinical psychology at the University of Pennsylvania. Clinical psychology aims at preventing and relieving psychological distress and promoting personal development. It involves science and clinical knowledge to understand human patterns. Patients would have their skull shape (phrenology) and face (physiognomy) examined for the doctor to study their personality.

Such scientific data analysis methods and predictions based on information for an individual patient grew steadily in university laboratories by the late 1800s. Mental distress was then in the domain of psychiatrists, while psychologists nurtured the notion of non-curable disorders based on the size and shape of human anatomy as pure science. This changed when Lightner Witmer, then head of the psychology department at the University of Pennsylvania, leveraged the knowledge accumulated thus far to treat a young boy who had trouble with spelling. In 1896 he opened the first psychological clinic at Penn dedicated to helping children with disabilities, and coined the term clinical psychology, defined as the study of individuals, by observation or experimentation, with the intention of promoting change.

An effective use of insights hidden in data derived from patients had successfully differentiated treatable psychological issues from non-curable mental disorders. In fact, two clinical intelligence tests, Army Alpha (verbal skills) and Army Beta (nonverbal skills) were conducted on large groups of recruits during World War I, the success of which led to assessments becoming the core discipline of clinical psychology, eventually leading to treatments.

Clinical trials have continued to evolve since then, and have proven to be a miracle medical invention that has led to numerous lifesaving medicines we know today.

Clinical Trials Today

The complex drug pipeline process poses an enormous challenge: To move through expensive and time-consuming clinical trials efficiently and rapidly.

A core part of clinical evaluation involves the recruitment and selection of eligible patients who go through training programs for relevant clinical trials. To select and recruit eligible patients for clinical trials, clinicians manually analyze medical big data (MBD) and face multiple challenges consisting of the amount of medical data (volume), the number of types of medical data (variety), and the speed with which to process medical data (velocity) to determine inclusion of a patient into a clinical trial.


Figure 2: Clinical trial patient

Challenges

  1. Volunteer based: A clinical trial relies heavily on volunteers willing to participate in studies.
  2. Selection process: It is difficult to quickly analyze medical big data against a list of eligibility criteria, with roughly a 60 percent chance of failing to select eligible patients for clinical trials. After spending a considerable amount of time in the selection process, only one in hundreds of new treatments and therapies that reach Phase I clinical trials is reported to end up as a genuine treatment.
  3. Cost: The associated cost is estimated at USD 124 million over a decade to complete per new drug per candidate, with half of this time spent in the recruitment of patients, clinical investigators, and in setting the environment for controlled clinical trials. A good example can be found in the study of multicenter randomized controlled trials, where clinical researchers spent about 86.8 staff hours, and over USD 100 in the recruitment of each randomized participant.
  4. Precision: Each examination is complex and could require many individuals to complete it. Besides, each patient is different.
  5. Long-term investment: Reported financial figures in medical journals support the argument that clinical trials involve a long-term investment, with an estimated cost of USD 12 billion on average, and take somewhere between 10 and 15 years to convey new treatments and therapies to the global market. Whereas the actual cost of how much the healthcare and pharmaceutical industries spend on drug development and in conveying new treatments and therapies to the world stage remains debatable, earlier research estimates that it costs somewhere around USD 161 million to USD 2 billion to convey new treatments and therapies to the market. Despite the variations in financial terms, the cost of clinical trials has been reported to have risen substantially, at the rate of 7.4 percent, which is higher than the estimated inflation rate for the past two decades.


Figure 3: Source: The Medical Futurist*

Though statistics have enabled us to design experiments precisely and to minimize errors in decision making, AI has the capacity to conduct such data analysis at the next level, enabling us to leverage not just a few hundred pieces of patient data, but millions. AI enables researchers to crunch enormous amounts of data within days or weeks, thereby reducing the heavy cost incurred in the pharmaceutical creation process. The result can also be customized to the individual, considering their own body's needs.

So, What is Artificial Intelligence?

At an abstract level, artificial intelligence is considered a science that is concerned with the computational understanding of intelligent behavior, with the development of medical artifacts that exhibit such behavior. This behavior consists of the ability of the computer to simulate human-level cognitive performance such as visual perception, speech recognition, decision making, and language translation.

Clinical Trials Powered by Artificial Intelligence

The rising cost of clinical trials and the difficulties involved in developing methodologies to acquire, analyze, and extract knowledge from medical big data in solving complex clinical problems account for the advancement in medical artificial intelligence (MAI).

Complementing people’s intelligence with machine intelligence, this augmented intelligence has an exponentially high impact. Machine learning can assist clinicians in their everyday clinical tasks, such as data manipulation and knowledge extraction, diagnosis formulation and the making of therapeutic decisions to predict clinical outcomes, and to improve the quality and lower the cost of clinical trials for better patient care.

Patient Recruiting and Data Collection

Most clinical trials today are conducted without direct information from patients. Most information is collected by third-party suppliers during patient visits. With the advent of mobile devices, the Internet of Things (IoT), and especially wearables, billions of individuals are now conveying important information effortlessly. This phenomenon offers a way to capture relevant information from patients in a continuous and convenient way. With the touch of a button, patients can choose to directly share their information for clinical trials over their mobile devices anywhere and everywhere. Additionally, the information captured is much more contextual, precise, and high quality; something we couldn’t even imagine with manual clinical trials.

Continuous Improvement

Clinical trial processing systems are gradually moving to the cloud, with millions of mobile data points transmitting information, and custom frameworks analyzing the data. This lends a way to run continuous and self-learning trials with greater precision.

Shared Resource Pool with Crowd Sourcing

Patient data can now be shared among multiple clinics through the cloud infrastructure, making it more enticing for patients to participate in trials across the globe.

Ensure Adherence

Given that the local recordings made over mobile devices are continuously transmitted to the cloud, it is now possible for clinics to catch anomalies in drug intake patterns among patients, in a real-time manner, and to even remind the patient if they forget to take their medication.

Predict Drug Effectiveness

Not every human has the same body type; hence, different people can react differently to the same medication. AI is an effective method for anticipating drug results, since it can treat a human genome together with all of its interacting genes. With AI it is conceivable to predict which patients with a particular disease would benefit the most from a drug.

Applications and Success Stories

Web-Based Selection Process

The application of artificial intelligence in clinical trials is enormous. Experts have created a web-based system that correctly selects and assigns cancer patients to clinical trials within 15 to 30 minutes and takes between 10 and 20 minutes to add new trials. The system is built to accommodate an increase in the number of patients selected for clinical trials, and suggests additional medical tests, while finding the most efficient test-ordering sequence that reduces the cost of recruitment.

ATACH-II*—A Mobile App

Application of artificial intelligence in clinical trials and medical research continues to grow. In a Phase-III clinical trial funded by the National Institute of Neurological Disorders and Stroke, the antihypertensive treatment in acute cerebral hemorrhage (ATACH-II*), in collaboration with MentorMate*, designed a mobile phone app called the ATACH-II app. The app provides assistance with pre-screening, patient eligibility assessment, and randomization in a five-year multi-center, randomized, controlled, Phase-III trial to evaluate the efficacy of early, intensive antihypertensive treatment using intravenous nicardipine for acute hypertension in subjects with spontaneous supratentorial ICH.

In a Phase-II clinical trial conducted at 20 centers across the United States, the UK, Canada, and Germany, the Clot Lysis Evaluating Accelerated Resolution of Intraventricular Hemorrhage (CLEAR-IVH) trial investigators adopted the ATACH-II mobile phone app. The study recruited 52 patients diagnosed with intraventricular hemorrhage (IVH) with third or fourth ventricle obstruction. Each participant was given a thrombolytic, recombinant tissue plasminogen activator (tPA), via an extra ventricular catheter (EVD), in one of three dosing regimens over a three-day period.

AiCure*—Real-Time Non-Adherence Mobile Platform

Research shows that over 20 percent of all clinical trials fail because of non-adherence.
AiCure* has developed a powerful, scalable, and real-time advanced non-adherence mobile technology platform that visually confirms medication ingestion. AiCure’s clinically validated platform combines the power of artificial intelligence with deep learning, computer vision, machine learning, and predictive analytics to make sure the right patient is taking the right medication at the right time. The real-time data will assist healthcare and pharmaceutical companies involved in conducting clinical trials to evaluate the efficacy of new treatments and therapies.

Radiology

A skilled radiologist can use AI tools to run tests, and focus on subjective, common-sense human decision making.

IBM Watson* for Oncology


Figure 4: IBM Watson* Analytics

After a patient’s tumor is sequenced by Quest Diagnostics*, Watson* analyzes the genetic alterations found to help identify potentially treatable mutations. This analysis can help oncologists identify targeted therapies for each patient’s individual cancer.

A Promising Future

AI is still in the infant stage of development and will not be able to replace a doctor.

Due to AI’s ability to understand natural language such as clinical notes, along with structured data such as dates and numbers, and the ability to generate hypotheses based on evidence, it is being considered as the fourth industrial revolution, for which the healthcare and pharmaceutical industries are seen as the biggest beneficiaries.

Artificial intelligence holds even greater promise, not only in transforming clinical research, but also in reducing the costs associated with disease management, successful ageing, and the discovery and development of new medical innovations. For example, it costs USD 200 billion and €125 billion per year to manage non-communicable diseases in America and Europe, respectively, while in the United States alone it costs three to five times more to provide support for successful ageing for someone aged 65 and over than for someone younger, a cost which is expected to decrease significantly with AI!

Additional Web Resources

https://www.cs.cmu.edu/~eugene/research/full/trial-knowledge.pdf

https://www.sciencedaily.com/releases/2016/04/160427095057.htm

https://www.nhlbi.nih.gov/studies/clinicaltrials

http://www.appliedclinicaltrialsonline.com/three-ways-clinical-trials-will-be-transformed-fourth-industrial-revolution?pageID=1

http://www.clinicalleader.com/doc/clinical-news-roundup-artificial-intelligence-ready-to-run-clinical-trials-0001

http://www.cbsnews.com/news/artificial-intelligence-making-a-difference-in-cancer-care/

http://www.thenakedscientists.com/articles/interviews/story-smallpox

Getting Started with Ubuntu* Core on an Intel® IoT Gateway

Introduction

This article demonstrates to new users how to install Ubuntu* Core on an Intel® IoT Gateway GB-BXTB-3825. The GB-BXTB-3825 is powered by an Intel® Atom™ E3825 dual-core processor, which makes it well suited to industrial applications such as data generation, data aggregation, and data analysis. Ubuntu* Core is a lightweight, transactional version of Ubuntu* designed for deployment on IoT devices. Snaps are universal Linux packages that can be installed on Ubuntu* Core to work on IoT devices and more. More information on the Gateway GB-BXTB-3825 is available at http://b2b.gigabyte.com/Embedded-System/GB-BXBT-3825-rev-10#ov. You can get detailed information on Ubuntu* Core at https://www.ubuntu.com/core.

Hardware Requirements

The hardware components used in this project are listed below:

  • An Intel® IoT Gateway: GB-BXTB-3825
  • 2 USB 2.0 or 3.0 flash drives with at least 2GB free space available
  • USB keyboard and mouse
  • A monitor with VGA or HDMI interface
  • A VGA or HDMI cable
  • A network connection with Internet access
  • An existing Linux* system to generate the RSA key (see Figure 1 below) and to log in to Ubuntu Core over SSH (Figures 11 and 12 below).

Software Requirements

The software requirements used in this project are listed below:

  • Ubuntu* Desktop image (used to create the live USB flash drive)
  • Ubuntu* Core 16 image for amd64 (ubuntu-core-16-amd64.img.xz)
  • An Ubuntu SSO account (created at https://login.ubuntu.com)

Generate a Host SSH Key

The first step is to create an Ubuntu SSO account from https://login.ubuntu.com. The account is required to create the first user on an Ubuntu Core installation.

  • Click on Personal details and fill out your information.
  • Generate an RSA key

Use an existing Linux system to generate the RSA key by running ssh-keygen -t rsa on the Linux shell:


Figure 1: Generate an SSH key on the Linux shell

Your public key is now available as .ssh/id_rsa.pub in your home folder (for example, /home/ubuntu/.ssh/id_rsa.pub).
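For reference, a minimal command sequence on the existing Linux system looks like this (accepting the default key location when prompted; the paths shown are examples):

ssh-keygen -t rsa        # generate an RSA key pair under ~/.ssh/
cat ~/.ssh/id_rsa.pub    # print the public key so it can be pasted into the SSH keys form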

  • Click on SSH keys and paste in the contents of your public key (/home/ubuntu/.ssh/id_rsa.pub).


Figure 2: Submitted the SSH keys successfully

 

Updating the BIOS

The Gateway should have its BIOS updated to the latest version. To check your Gateway BIOS version:

  • If the Gateway is running Windows*, go to Start -> Run and type “msinfo32.exe”, or
  • Turn on your Gateway and press F12 to enter the BIOS

Visit http://www.intel.com/content/www/us/en/support/boards-and-kits/000005850.html to download the latest BIOS version and for instructions on how to install.

Create a Live USB Ubuntu* Flash Drive

Download the Ubuntu* Desktop image from https://www.ubuntu.com/download/desktop and write it to the first USB flash drive to create a bootable live drive, then copy the Ubuntu Core image file (ubuntu-core-16-amd64.img.xz) onto the second USB flash drive.
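A minimal sketch of writing the desktop image with dd, assuming the downloaded file is named ubuntu-desktop-amd64.iso and the first flash drive shows up as /dev/sdX (both placeholders; verify the device name with lsblk before writing, since dd overwrites the target):

sudo dd if=ubuntu-desktop-amd64.iso of=/dev/sdX bs=4M status=progress && sync   # write the live image to the first USB drive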

Booting from the Live USB Flash Drive

  • Connect the USB hub, keyboard, mouse and the monitor to the Gateway GB-BXTB-3825.


Figure 3: Gateway GB-BXTB-3825

 

  • Insert the Live USB Ubuntu Desktop flash drive you created earlier into the Gateway GB-BXTB-3825.
  • Turn on your Gateway GB-BXTB-3825 and press F12 on the keyboard to enter the boot menu.
  • Select the USB flash drive as a boot option.


Figure 4: Select boot device

  • Select "Try Ubuntu without installing”.


Figure 5: Try Ubuntu without installing

Install Ubuntu* Core Image

  • Insert the second USB flash drive containing the Ubuntu Core image file.
  • Open a terminal and type:
xzcat /media/ubuntu/<name of the second USB flash drive>/ubuntu-core-16-amd64.img.xz | sudo dd of=/dev/sda bs=32M status=progress; sync
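The flash command assumes the Gateway’s internal drive appears as /dev/sda. If you want to confirm the target device from the live session before flashing, a quick check is:

lsblk -d -o NAME,SIZE,MODEL   # list whole disks; the internal drive is the dd target, not the USB flash drives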


Figure 6: Flash Ubuntu Core

  • Reboot the Gateway. It will now boot from the internal storage to which Ubuntu Core was flashed.

Configure the Gateway

  • After the Gateway has rebooted, you will see the prompt “Press enter to configure.”
  • Select Start to configure your network. Below is an example of the network configuration.


Figure 7: Configure IPv4


Figure 8: After network configuration

  • Enter the Ubuntu SSO (Ubuntu One) email address that was set up earlier.


Figure 9: Profile setup


Figure 10: Configuration complete

First User login

  • First, add RSA identities to the authentication agent by running ssh-add on the shell.


Figure 11: ssh-add command

  • Next, log in to Ubuntu Core over SSH from a different machine on the same network (a minimal command sequence is sketched below Figure 12). No password is required.


Figure 12: ssh into Ubuntu Core
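Putting the two steps together, a minimal sketch of the login sequence from the other Linux machine (the user name and address below are placeholders; use your Ubuntu SSO user name and the Gateway IP address shown after network configuration):

ssh-add                           # load the RSA identity generated earlier into the SSH agent
ssh <sso-username>@<gateway-ip>   # key-based login to Ubuntu Core; no password prompt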

  • Set a password in case you want to log in from the local console on the IoT Gateway (see the sketch after Figure 13).


Figure 13: Set a password
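A minimal sketch, run on the Gateway over the SSH session, assuming the first user was created from your Ubuntu SSO account:

sudo passwd <sso-username>   # set a local password for console logins on the Gateway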

Run Hello World Snap on LocalHost

The Gateway is now ready for snaps. Snaps are self-contained application bundles that include most of the libraries and runtimes they need. A snap is a SquashFS filesystem containing your application code and a snap.yaml file.

  • Sign in to a Snap store using an Ubuntu SSO account:

Figure 14: Sign in to a snap store
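In text form, the sign-in step above amounts to a single command (the email address is a placeholder for your Ubuntu SSO account; you will be prompted for its password):

snap login your.name@example.com   # authenticate to the snap store with your Ubuntu SSO account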

  • Install the Hello Snap using the snap name:


Figure 15: Install hello snap
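The corresponding command for this step is simply:

sudo snap install hello   # download and install the hello snap from the store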

  • Run the Hello Snap:


Figure 16: Run hello snap
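And to run it from the same shell:

hello   # runs the command provided by the hello snap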

Summary

We have described how to install Ubuntu* Core on an Intel IoT Gateway GB-BXTB-3825 and also how to run the Hello World Snap. Visit https://snapcraft.io/docs/build-snaps to make your own snap and enjoy the power of the Intel® IoT Gateway GB-BXTB-3825.

Key References

 

About the Author

Nancy Le is a software engineer at Intel Corporation in the Software and Services Group working on Intel Atom® processor enabling for Intel® IoT projects.

 

 

 

 

Making Smart Systems Smarter

The Internet of Things is growing exponentially and shows no signs of stopping. We’ve all heard the statistics. By 2020, there will be billions of connected devices. And these devices are generating explosive volumes of data. The good news is that IoT devices are transforming the way we do business and helping to create a safer, more efficient, more innovative future. The bad news? If all that data has to be transported to the cloud for processing and analyzing, networks will be massively overwhelmed — leading us to a world of delayed results. 

But not bad news for long. Smart, connected “things” are getting smarter. And a more intelligent edge is becoming capable of doing more complex analytics. In fact, IDC predicts that “by 2019, at least 40 percent of IoT-created data will be stored, processed, analyzed, and acted upon close to, or at the edge of, the network.”1

Think of a factory using smart devices to monitor the work environment and send notifications if conditions reach unsafe levels. The devices can sense the issue and process the data—enabling time-critical alerts so workers can be evacuated safely. Think of an oil or mining company using sensors and edge computing to monitor conditions and control heavy equipment to automate and optimize operations in remote locations. These are just two examples of how an IoT system with more compute power at the edge can speed the movement of data from insight to action.

Intel is driving smarter devices and a more intelligent edge with cloud and edge technologies that deliver greater business value, edge analytics, and deep learning and machine learning capabilities.

To further these efforts, we have created a joint reference architecture with Amazon Web Services (AWS)* and are launching an Enterprise IoT Developer Kit. The architecture incorporates AWS Greengrass software, which works seamlessly with the Intel® IoT Platform. With Greengrass, Intel® devices can ingest, process, and store data locally and make local decisions, paired with leading cloud-based capabilities.   

The developer kit provides integration of sensors and the middleware protocol stack, shortening time-to-market from prototyping to deployment. The kit includes: Intel® IoT Gateway Technology, AWS* IoT and Greengrass-specific plugins, development boards and starter kits, IDEs to support a variety of programming languages, libraries to support I/O and sensor interactions, documentation, and code samples.

Intel has been implementing the reference architecture across its ecosystem, working with IoT Equipment Builders on validation and testing to ensure compatibility. A number of Intel-based IoT Gateways from leading IoT Equipment Builders, members of the IoT Solutions Alliance, have already been validated for Greengrass. With the backing of an extended IoT community, these and other IoT solutions are driving real-time insights to help businesses increase revenue, cut costs, and improve business outcomes.

Smarter things that are part of a more intelligent edge will fuel the growth of the IoT and its capability to bring meaning to data and benefits to business. To learn more, visit intel.com/iot/aws.


1IDC FutureScape: Worldwide Internet of Things 2017 Predictions, November 2016.

What's New? Intel® Threading Building Blocks 2017 Update 7

This update contains a bug fix relative to the previous Intel® Threading Building Blocks (Intel® TBB) 2017 Update 6 release. Information about the new features of the previous release can be found in the Intel TBB 2017 Update 6 announcement.

Added functionality:

  • In huge pages mode, the memory allocator is now also able to use transparent huge pages.

Preview Features:

  • Added support for Intel TBB integration into CMake-aware projects, with valuable guidance and feedback provided by Brad King (Kitware).

Bugs fixed:

  • Fixed scalable_allocation_command(TBBMALLOC_CLEAN_ALL_BUFFERS, 0) to process memory left after exited threads.

Intel TBB 2017 Update 7 is an open-source-only release; you can download it from https://github.com/01org/tbb/releases.

 
