Getting Started
🚧
- Preparation
- Tutorial
  - Hello World!
  - Official Solution
This plugin requires native libraries (e.g. libmediapipe_c.so, mediapipe_c.dll, mediapipe_android.aar, etc...) to work, but they are not included in the repository.
If you've not built them yet, go to https://github.com/homuler/MediaPipeUnityPlugin/wiki/Installation-Guide first.
Before using the plugin in your project, it's strongly recommended that you check if it works in this project.
First, open Assets/MediaPipeUnity/Samples/Scenes/Start Scene.unity.

And play the scene.
If you've built the plugin successfully, the Face Detection sample will start after a while.

Once you've built the plugin, you can import it into your project. Choose your favorite method from the following options.
- Open this project.
- Click Tools > Export Unitypackage.
  - A MediaPipeUnity.[version].unitypackage file will be created at the project root.
- Open your project.

- Install the npm command.
- Build a tarball file.
cd Packages/com.github.homuler.mediapipe
npm pack
# com.github.homuler.mediapipe-[version].tgz will be created
mv com.github.homuler.mediapipe-[version].tgz your/favorite/path
- Install the package from the tarball file.
⚠️ Development with Git submodules tends to be a bit more complicated.
- Add a submodule.
mkdir Submodules
cd Submodules
git submodule add https://github.com/homuler/MediaPipeUnityPlugin
- Build the plugin.
cd MediaPipeUnityPlugin
python build.py build ...
- Install the package from Submodules/MediaPipeUnityPlugin/Packages/com.github.homuler.mediapipe.
⚠️ If you are not familiar with MediaPipe, you may want to read the Framework Concepts article first.
Let's write our first program!
🔔 The following code is based on mediapipe/examples/desktop/examples/hello_world/hello_world.cc.
To run the Calculators provided by MediaPipe, we usually need to initialize a CalculatorGraph, so let's do that first!
🔔 Each CalculatorGraph has its own config (CalculatorGraphConfig).
var configText = @"
input_stream: ""in""
output_stream: ""out""
node {
calculator: ""PassThroughCalculator""
input_stream: ""in""
output_stream: ""out1""
}
node {
calculator: ""PassThroughCalculator""
input_stream: ""out1""
output_stream: ""out""
}
";
var graph = new CalculatorGraph(configText);
To run a CalculatorGraph, call the StartRun method.
graph.StartRun().AssertOk();
Note that the StartRun method returns a Status object, which represents the result.
Status#AssertOk throws an exception if and only if the result is not OK.
After starting, of course we want to give inputs to the CalculatorGraph, right?
Let's say we want to give a sequence of 10 strings ("Hello World!") as input.
for (var i = 0; i < 10; i++)
{
// Send input to running graph
}
In MediaPipe, input is passed through a class called Packet.
var input = new StringPacket("Hello World!");
To pass an input Packet to the CalculatorGraph, we can use CalculatorGraph#AddPacketToInputStream.
Note that this CalculatorGraph has only one input stream, named "in".
🔔 This depends on the CalculatorGraphConfig. A CalculatorGraph can have multiple input streams.
for (var i = 0; i < 10; i++)
{
var input = new StringPacket("Hello World!");
graph.AddPacketToInputStream("in", input).AssertOk();
}
CalculatorGraph#AddPacketToInputStream also returns a Status object, so let's call AssertOk here as well.
After everything is done, we should
- close input streams
- dispose of the CalculatorGraph
so let's do that.
Again, note that each method returns a Status object.
graph.CloseInputStream("in").AssertOk();
graph.WaitUntilDone().AssertOk();
graph.Dispose();
For now, let's just run the code we've written so far.
Save the following code as HelloWorld.cs, attach it to an empty GameObject and play the scene.
using UnityEngine;
namespace Mediapipe.Unity.Tutorial
{
public class HelloWorld : MonoBehaviour
{
private void Start()
{
var configText = @"
input_stream: ""in""
output_stream: ""out""
node {
calculator: ""PassThroughCalculator""
input_stream: ""in""
output_stream: ""out1""
}
node {
calculator: ""PassThroughCalculator""
input_stream: ""out1""
output_stream: ""out""
}
";
var graph = new CalculatorGraph(configText);
graph.StartRun().AssertOk();
for (var i = 0; i < 10; i++)
{
var input = new StringPacket("Hello World!");
graph.AddPacketToInputStream("in", input).AssertOk();
}
graph.CloseInputStream("in").AssertOk();
graph.WaitUntilDone().AssertOk();
graph.Dispose();
Debug.Log("Done");
}
}
}
Oops, I see an error.
MediaPipeException: INVALID_ARGUMENT: Graph has errors:
; In stream "in", timestamp not specified or set to illegal value: Timestamp::Unset()
at Mediapipe.Status.AssertOk () [0x00014] in /home/homuler/Development/unity/MediaPipeUnityPlugin/Packages/com.github.homuler.mediapipe/Runtime/Scripts/Framework/Port/Status.cs:50
at Mediapipe.Unity.Tutorial.HelloWorld.Start () [0x00025] in /home/homuler/Development/unity/MediaPipeUnityPlugin/Assets/MediaPipeUnity/Tutorial/Hello World/HelloWorld.cs:35
Each input packet should have a timestamp, but it does not appear to be set.
Let's fix the code that initializes a Packet as follows.
// var input = new StringPacket("Hello World!");
var input = new StringPacket("Hello World!", new Timestamp(i));
This time it seems to work.
But wait, we are not receiving the CalculatorGraph output!
To get output, we need to do more work before running the CalculatorGraph.
Note that this CalculatorGraph has only one output stream, named "out".
🔔 This depends on the CalculatorGraphConfig. A CalculatorGraph can have multiple output streams.
var graph = new CalculatorGraph(configText);
// Initialize an `OutputStreamPoller`.
// NOTE: The type parameter is `string` since the output type is `string`.
var poller = graph.AddOutputStreamPoller<string>("out").Value();
graph.StartRun().AssertOk();
CalculatorGraph#AddOutputStreamPoller<T> returns a StatusOr<T> object.
StatusOr<T> is similar to Status, but it can contain a value if the Status is OK.
🔔 In production, you should check that it's OK before calling StatusOr<T>#Value.
var statusOrPoller = graph.AddOutputStreamPoller<string>("out");
if (statusOrPoller.Ok())
{
  var poller = statusOrPoller.Value();
}
Then, we can get output using OutputStreamPoller<string>#Next.
Like inputs, outputs must be received through packets.
graph.CloseInputStream("in").AssertOk();
// Initialize an empty packet
var output = new StringPacket();
while (poller.Next(output))
{
Debug.Log(output.Get());
}
graph.WaitUntilDone().AssertOk();
Now, our code would look like this.
using UnityEngine;
namespace Mediapipe.Unity.Tutorial
{
public class HelloWorld : MonoBehaviour
{
private void Start()
{
var configText = @"
input_stream: ""in""
output_stream: ""out""
node {
calculator: ""PassThroughCalculator""
input_stream: ""in""
output_stream: ""out1""
}
node {
calculator: ""PassThroughCalculator""
input_stream: ""out1""
output_stream: ""out""
}
";
var graph = new CalculatorGraph(configText);
var poller = graph.AddOutputStreamPoller<string>("out").Value();
graph.StartRun().AssertOk();
for (var i = 0; i < 10; i++)
{
var input = new StringPacket("Hello World!", new Timestamp(i));
graph.AddPacketToInputStream("in", input).AssertOk();
}
graph.CloseInputStream("in").AssertOk();
var output = new StringPacket();
while (poller.Next(output))
{
Debug.Log(output.Get());
}
graph.WaitUntilDone().AssertOk();
graph.Dispose();
Debug.Log("Done");
}
}
}
What happens if the config format is invalid?
var graph = new CalculatorGraph("invalid format");
Hmm, the constructor fails, which is probably the correct behavior.
Let's check Editor.log.
[libprotobuf ERROR external/com_google_protobuf/src/google/protobuf/text_format.cc:335] Error parsing text-format mediapipe.CalculatorGraphConfig: 1:9: Message type "mediapipe.CalculatorGraphConfig" has no field named "invalid".
MediaPipeException: Failed to parse config text. See error logs for more details
at Mediapipe.CalculatorGraphConfigExtension.ParseFromTextFormat (Google.Protobuf.MessageParser`1[T] _, System.String configText) [0x0001e] in /home/homuler/Development/unity/MediaPipeUnityPlugin/Packages/com.github.homuler.mediapipe/Runtime/Scripts/Framework/CalculatorGraphConfigExtension.cs:21
at Mediapipe.CalculatorGraph..ctor (System.String textFormatConfig) [0x00000] in /home/homuler/Development/unity/MediaPipeUnityPlugin/Packages/com.github.homuler.mediapipe/Runtime/Scripts/Framework/CalculatorGraph.cs:33
at Mediapipe.Unity.Tutorial.HelloWorld.Start () [0x00000] in /home/homuler/Development/unity/MediaPipeUnityPlugin/Assets/MediaPipeUnity/Tutorial/Hello World/HelloWorld.cs:31
Not too bad, but it's inconvenient to check Editor.log every time.
Let's fix it so that the logs are visible in the Console Window.
Protobuf.SetLogHandler(Protobuf.DefaultLogHandler);
var graph = new CalculatorGraph("invalid format");
Great!
But there is a subtle yet serious pitfall here: it can cause a SIGSEGV.
To avoid it, don't forget to restore the default LogHandler when the application exits.
void OnApplicationQuit()
{
Protobuf.ResetLogHandler();
}
In this section, let's try running the Face Mesh Solution.
First, let's display the Web Camera image on the screen.
using System.Collections;
using UnityEngine;
using UnityEngine.UI;
namespace Mediapipe.Unity.Tutorial
{
public class FaceMesh : MonoBehaviour
{
[SerializeField] private TextAsset _configAsset;
[SerializeField] private RawImage _screen;
[SerializeField] private int _width;
[SerializeField] private int _height;
[SerializeField] private int _fps;
private WebCamTexture _webCamTexture;
private IEnumerator Start()
{
if (WebCamTexture.devices.Length == 0)
{
throw new System.Exception("Web Camera devices are not found");
}
var webCamDevice = WebCamTexture.devices[0];
_webCamTexture = new WebCamTexture(webCamDevice.name, _width, _height, _fps);
_webCamTexture.Play();
yield return new WaitUntil(() => _webCamTexture.width > 16);
_screen.rectTransform.sizeDelta = new Vector2(_width, _height);
_screen.texture = _webCamTexture;
while (true)
{
yield return new WaitForEndOfFrame();
}
}
private void OnDestroy()
{
if (_webCamTexture != null)
{
_webCamTexture.Stop();
}
}
}
}
If everything is fine, your screen will look like this.

Now let's try face_mesh_desktop_live.pbtxt, the official Face Mesh sample!
⚠️ To run the graph, you must build native libraries with GPU disabled.
First, initialize a CalculatorGraph as in the Hello World example.
var graph = new CalculatorGraph(_configAsset.text);
graph.StartRun().AssertOk();
In MediaPipe, image data on the CPU is stored in a class called ImageFrame.
Let's initialize an ImageFrame instance from the WebCamTexture image.
💡 On the other hand, image data on the GPU is stored in a class called GpuBuffer.
We can initialize an ImageFrame instance using NativeArray<byte>.
Here, although it is not the best approach in terms of performance, we will copy the WebCamTexture data to a Texture2D to obtain a NativeArray<byte>.
Texture2D inputTexture = new Texture2D(_width, _height, TextureFormat.RGBA32, false);
Color32[] pixelData = new Color32[_width * _height];
while (true)
{
inputTexture.SetPixels32(_webCamTexture.GetPixels32(pixelData));
yield return new WaitForEndOfFrame();
}
Now we can initialize an ImageFrame instance using inputTexture.
⚠️ In theory, you can build ImageFrame instances using various formats, but not all Calculators necessarily support all formats. Official solutions often work only with the RGBA32 format.
var imageFrame = new ImageFrame(ImageFormat.Types.Format.Srgba, _width, _height, _width * 4, inputTexture.GetRawTextureData<byte>());
The 4th argument, widthStep, may require some explanation.
It's the byte offset between a pixel value and the same pixel and channel in the next row.
In most cases, this is equal to the product of the width and the number of channels.
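To make the widthStep arithmetic concrete, here is a minimal sketch in plain C#. The `ImageLayout` class and its method names are hypothetical helpers written for illustration, not part of the plugin, and the sketch assumes a tightly packed buffer with no padding at the end of each row.

```csharp
public static class ImageLayout
{
    // Number of bytes in one row of a tightly packed image (assumption: no row padding).
    public static int WidthStep(int width, int channels, int bytesPerChannel = 1)
    {
        return checked(width * channels * bytesPerChannel);
    }

    // Byte offset of one channel of the pixel at (x, y), counting rows from the first row in the buffer.
    public static int ByteOffset(int x, int y, int channel, int widthStep, int channels, int bytesPerChannel = 1)
    {
        return checked(y * widthStep + (x * channels + channel) * bytesPerChannel);
    }
}
```

For a 640-pixel-wide RGBA32 image (4 channels, 1 byte each), `ImageLayout.WidthStep(640, 4)` gives 2560, which matches the `_width * 4` expression passed to the ImageFrame constructor in this tutorial.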
As usual, initialize a Packet and send it to the CalculatorGraph.
Note that the input stream name is "input_video" and the input type is ImageFrame this time.
graph.AddPacketToInputStream("input_video", new ImageFramePacket(imageFrame)).AssertOk();
We should stop the CalculatorGraph on the OnDestroy event.
With a little refactoring, the code now looks like this.
using System.Collections;
using UnityEngine;
using UnityEngine.UI;
namespace Mediapipe.Unity.Tutorial
{
public class FaceMesh : MonoBehaviour
{
[SerializeField] private TextAsset _configAsset;
[SerializeField] private RawImage _screen;
[SerializeField] private int _width;
[SerializeField] private int _height;
[SerializeField] private int _fps;
private CalculatorGraph _graph;
private WebCamTexture _webCamTexture;
private Texture2D _inputTexture;
private Color32[] _pixelData;
private IEnumerator Start()
{
if (WebCamTexture.devices.Length == 0)
{
throw new System.Exception("Web Camera devices are not found");
}
var webCamDevice = WebCamTexture.devices[0];
_webCamTexture = new WebCamTexture(webCamDevice.name, _width, _height, _fps);
_webCamTexture.Play();
yield return new WaitUntil(() => _webCamTexture.width > 16);
_screen.rectTransform.sizeDelta = new Vector2(_width, _height);
_screen.texture = _webCamTexture;
_inputTexture = new Texture2D(_width, _height, TextureFormat.RGBA32, false);
_pixelData = new Color32[_width * _height];
_graph = new CalculatorGraph(_configAsset.text);
_graph.StartRun().AssertOk();
while (true)
{
_inputTexture.SetPixels32(_webCamTexture.GetPixels32(_pixelData));
var imageFrame = new ImageFrame(ImageFormat.Types.Format.Srgba, _width, _height, _width * 4, _inputTexture.GetRawTextureData<byte>());
_graph.AddPacketToInputStream("input_video", new ImageFramePacket(imageFrame)).AssertOk();
yield return new WaitForEndOfFrame();
}
}
private void OnDestroy()
{
if (_webCamTexture != null)
{
_webCamTexture.Stop();
}
if (_graph != null)
{
try
{
_graph.CloseInputStream("input_video").AssertOk();
_graph.WaitUntilDone().AssertOk();
}
finally
{
_graph.Dispose();
}
}
}
}
}
Let's play the scene!

Well, it's not so easy, is it?
MediaPipeException: INVALID_ARGUMENT: Graph has errors: Calculator::Open() for node "facelandmarkfrontcpu__facelandmarkcpu__facelandmarksmodelloader__LocalFileContentsCalculator" failed: ; Can't find file: mediapipe/modules/face_landmark/face_landmark_with_attention.tflite
It looks like LocalFileContentsCalculator failed to load face_landmark_with_attention.tflite.
In the next section, we will resolve this error.
⚠️ If you get error messages like the following, go to [...].
F20220418 11:58:05.626176 230087 calculator_graph.cc:126] Non-OK-status: Initialize(config) status: NOT_FOUND: ValidatedGraphConfig Initialization failed. No registered object with name: FaceLandmarkFrontCpu; Unable to find Calculator "FaceLandmarkFrontCpu" No registered object with name: FaceRendererCpu; Unable to find Calculator "FaceRendererCpu"
To load model files on Unity, we need to resolve their paths because they are hardcoded.
Not only that, we even need to save the file in a specific path because some calculators are written to read dependent resources from the file system.
💡 The path to save is not fixed since we can translate each model path into an arbitrary path.
But don't worry. In most cases, all you need to do is initialize a ResourceManager class and call the PrepareAssetAsync method in advance.
💡 The PrepareAssetAsync method will save the specified file under Application.persistentDataPath.
For testing purposes, the LocalResourceManager class is sufficient.
var resourceManager = new LocalResourceManager();
yield return resourceManager.PrepareAssetAsync("dependent_asset_name");
In development / production, you can choose either StreamingAssetsResourceManager or AssetBundleResourceManager.
For example, StreamingAssetsResourceManager will load model files from Application.streamingAssetsPath.
// NOTE: Dependent assets must be placed under `Assets/StreamingAssets`.
var resourceManager = new StreamingAssetsResourceManager();
yield return resourceManager.PrepareAssetAsync("dependent_asset_name");
⚠️ The ResourceManager class can be initialized only once. In other words, you cannot use both StreamingAssetsResourceManager and AssetBundleResourceManager in one application.
Now, let's get back to the code.
After trial and error, we find that we need to prepare files face_detection_short_range.tflite and face_landmark_with_attention.tflite.
Unity does not support .tflite extension, so this plugin adopts the .bytes extension instead.
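As an illustration of that naming convention, the sketch below maps a model path as it appears in an error message to the asset name used in this tutorial. `ModelAssetName` and `ToAssetName` are hypothetical names, not plugin APIs, and the sketch assumes the asset keeps the model's file name with only the extension changed.

```csharp
using System.IO;

public static class ModelAssetName
{
    // Drops the directory part and swaps the extension for ".bytes".
    public static string ToAssetName(string modelPath)
    {
        return Path.ChangeExtension(Path.GetFileName(modelPath), ".bytes");
    }
}
```

For example, `ModelAssetName.ToAssetName("mediapipe/modules/face_landmark/face_landmark_with_attention.tflite")` returns `face_landmark_with_attention.bytes`.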
Now the entire code will look like this.
using System.Collections;
using UnityEngine;
using UnityEngine.UI;
namespace Mediapipe.Unity.Tutorial
{
public class FaceMesh : MonoBehaviour
{
[SerializeField] private TextAsset _configAsset;
[SerializeField] private RawImage _screen;
[SerializeField] private int _width;
[SerializeField] private int _height;
[SerializeField] private int _fps;
private CalculatorGraph _graph;
private ResourceManager _resourceManager;
private WebCamTexture _webCamTexture;
private Texture2D _inputTexture;
private Color32[] _pixelData;
private IEnumerator Start()
{
if (WebCamTexture.devices.Length == 0)
{
throw new System.Exception("Web Camera devices are not found");
}
var webCamDevice = WebCamTexture.devices[0];
_webCamTexture = new WebCamTexture(webCamDevice.name, _width, _height, _fps);
_webCamTexture.Play();
yield return new WaitUntil(() => _webCamTexture.width > 16);
_screen.rectTransform.sizeDelta = new Vector2(_width, _height);
_screen.texture = _webCamTexture;
_inputTexture = new Texture2D(_width, _height, TextureFormat.RGBA32, false);
_pixelData = new Color32[_width * _height];
_resourceManager = new LocalResourceManager();
yield return _resourceManager.PrepareAssetAsync("face_detection_short_range.bytes");
yield return _resourceManager.PrepareAssetAsync("face_landmark_with_attention.bytes");
_graph = new CalculatorGraph(_configAsset.text);
_graph.StartRun().AssertOk();
while (true)
{
_inputTexture.SetPixels32(_webCamTexture.GetPixels32(_pixelData));
var imageFrame = new ImageFrame(ImageFormat.Types.Format.Srgba, _width, _height, _width * 4, _inputTexture.GetRawTextureData<byte>());
_graph.AddPacketToInputStream("input_video", new ImageFramePacket(imageFrame)).AssertOk();
yield return new WaitForEndOfFrame();
}
}
private void OnDestroy()
{
if (_webCamTexture != null)
{
_webCamTexture.Stop();
}
if (_graph != null)
{
try
{
_graph.CloseInputStream("input_video").AssertOk();
_graph.WaitUntilDone().AssertOk();
}
finally
{
_graph.Dispose();
}
}
}
}
}
What will be the result this time...?

Oops, once again I forgot to set the timestamp.
But what value should I set for the timestamp this time?
In the Hello World example, the loop variable i was set to the value of Timestamp.
In practice, however, MediaPipe assumes that the value of Timestamp is in microseconds (cf. mediapipe/framework/timestamp.h).
🔔 Some calculators care about the absolute value of the Timestamp, so they may behave unexpectedly if the value is not in microseconds.
Let's initialize a Timestamp with a microsecond value from the start.
using Stopwatch = System.Diagnostics.Stopwatch;
var stopwatch = new Stopwatch();
stopwatch.Start();
var currentTimestamp = stopwatch.ElapsedTicks / (System.TimeSpan.TicksPerMillisecond / 1000);
var timestamp = new Timestamp(currentTimestamp);
And the entire code:
using System.Collections;
using UnityEngine;
using UnityEngine.UI;
using Stopwatch = System.Diagnostics.Stopwatch;
namespace Mediapipe.Unity.Tutorial
{
public class FaceMesh : MonoBehaviour
{
[SerializeField] private TextAsset _configAsset;
[SerializeField] private RawImage _screen;
[SerializeField] private int _width;
[SerializeField] private int _height;
[SerializeField] private int _fps;
private CalculatorGraph _graph;
private ResourceManager _resourceManager;
private WebCamTexture _webCamTexture;
private Texture2D _inputTexture;
private Color32[] _pixelData;
private IEnumerator Start()
{
if (WebCamTexture.devices.Length == 0)
{
throw new System.Exception("Web Camera devices are not found");
}
var webCamDevice = WebCamTexture.devices[0];
_webCamTexture = new WebCamTexture(webCamDevice.name, _width, _height, _fps);
_webCamTexture.Play();
yield return new WaitUntil(() => _webCamTexture.width > 16);
_screen.rectTransform.sizeDelta = new Vector2(_width, _height);
_screen.texture = _webCamTexture;
_inputTexture = new Texture2D(_width, _height, TextureFormat.RGBA32, false);
_pixelData = new Color32[_width * _height];
_resourceManager = new LocalResourceManager();
yield return _resourceManager.PrepareAssetAsync("face_detection_short_range.bytes");
yield return _resourceManager.PrepareAssetAsync("face_landmark_with_attention.bytes");
var stopwatch = new Stopwatch();
_graph = new CalculatorGraph(_configAsset.text);
_graph.StartRun().AssertOk();
stopwatch.Start();
while (true)
{
_inputTexture.SetPixels32(_webCamTexture.GetPixels32(_pixelData));
var imageFrame = new ImageFrame(ImageFormat.Types.Format.Srgba, _width, _height, _width * 4, _inputTexture.GetRawTextureData<byte>());
var currentTimestamp = stopwatch.ElapsedTicks / (System.TimeSpan.TicksPerMillisecond / 1000);
_graph.AddPacketToInputStream("input_video", new ImageFramePacket(imageFrame, new Timestamp(currentTimestamp))).AssertOk();
yield return new WaitForEndOfFrame();
}
}
private void OnDestroy()
{
if (_webCamTexture != null)
{
_webCamTexture.Stop();
}
if (_graph != null)
{
try
{
_graph.CloseInputStream("input_video").AssertOk();
_graph.WaitUntilDone().AssertOk();
}
finally
{
_graph.Dispose();
}
}
}
}
}
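One caveat about the microsecond conversion in the listing: `Stopwatch.ElapsedTicks` counts in units of `Stopwatch.Frequency` ticks per second, which is not guaranteed to match the 100 ns ticks that `System.TimeSpan.TicksPerMillisecond` describes. Dividing by 10 is therefore correct only where the frequency is 10 MHz. A frequency-independent conversion could be sketched as follows; `TimestampMath` is a hypothetical helper name, not part of the plugin.

```csharp
using System.Diagnostics;

public static class TimestampMath
{
    private const long MicrosecondsPerSecond = 1000000L;

    // Converts raw stopwatch ticks to microseconds for a given tick frequency (ticks per second).
    // Whole seconds and the remainder are converted separately to avoid overflow on long runs.
    public static long TicksToMicroseconds(long stopwatchTicks, long frequency)
    {
        var seconds = stopwatchTicks / frequency;
        var remainder = stopwatchTicks % frequency;
        return seconds * MicrosecondsPerSecond + remainder * MicrosecondsPerSecond / frequency;
    }

    // Elapsed time of a running or stopped Stopwatch in microseconds.
    public static long ElapsedMicroseconds(Stopwatch stopwatch)
    {
        return TicksToMicroseconds(stopwatch.ElapsedTicks, Stopwatch.Frequency);
    }
}
```

With this helper, the value passed to `new Timestamp(...)` would be `TimestampMath.ElapsedMicroseconds(stopwatch)`. An equivalent alternative is `stopwatch.Elapsed.Ticks / 10`, since `Elapsed` is a `TimeSpan` whose ticks are always 100 ns.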
Now, it seems to be working.
But of course, we want to receive output next.
In the Hello World example, we initialized OutputStreamPoller using CalculatorGraph#AddOutputStreamPoller.
This time, to handle output more easily, let's use the OutputStream API provided by the plugin instead!
var graph = new CalculatorGraph(_configAsset.text);
var outputVideoStream = new OutputStream<ImageFramePacket, ImageFrame>(graph, "output_video");
This may seem a bit tedious, but both the Packet type and the value type of the output must be specified.
And before running the CalculatorGraph, call StartPolling.
// NOTE: StartPolling returns Status
outputVideoStream.StartPolling().AssertOk();
graph.StartRun().AssertOk();
To get the next output, call TryGetNext.
It returns true if the next output is retrieved successfully.
if (outputVideoStream.TryGetNext(out var outputVideo))
{
// ...
}
This time, let's display the output image directly on the screen.
We can read the pixel data using ImageFrame#TryReadPixelData.
// NOTE: TryReadPixelData is implemented in `Mediapipe.Unity.ImageFrameExtension`.
// using Mediapipe.Unity;
var outputTexture = new Texture2D(_width, _height, TextureFormat.RGBA32, false);
var outputPixelData = new Color32[_width * _height];
_screen.texture = outputTexture;
if (outputVideoStream.TryGetNext(out var outputVideo))
{
if (outputVideo.TryReadPixelData(outputPixelData))
{
outputTexture.SetPixels32(outputPixelData);
outputTexture.Apply();
}
}
Now our code should look something like this.
using System.Collections;
using UnityEngine;
using UnityEngine.UI;
using Stopwatch = System.Diagnostics.Stopwatch;
namespace Mediapipe.Unity.Tutorial
{
public class FaceMesh : MonoBehaviour
{
[SerializeField] private TextAsset _configAsset;
[SerializeField] private RawImage _screen;
[SerializeField] private int _width;
[SerializeField] private int _height;
[SerializeField] private int _fps;
private CalculatorGraph _graph;
private ResourceManager _resourceManager;
private WebCamTexture _webCamTexture;
private Texture2D _inputTexture;
private Color32[] _inputPixelData;
private Texture2D _outputTexture;
private Color32[] _outputPixelData;
private IEnumerator Start()
{
if (WebCamTexture.devices.Length == 0)
{
throw new System.Exception("Web Camera devices are not found");
}
var webCamDevice = WebCamTexture.devices[0];
_webCamTexture = new WebCamTexture(webCamDevice.name, _width, _height, _fps);
_webCamTexture.Play();
yield return new WaitUntil(() => _webCamTexture.width > 16);
_screen.rectTransform.sizeDelta = new Vector2(_width, _height);
_inputTexture = new Texture2D(_width, _height, TextureFormat.RGBA32, false);
_inputPixelData = new Color32[_width * _height];
_outputTexture = new Texture2D(_width, _height, TextureFormat.RGBA32, false);
_outputPixelData = new Color32[_width * _height];
_screen.texture = _outputTexture;
_resourceManager = new LocalResourceManager();
yield return _resourceManager.PrepareAssetAsync("face_detection_short_range.bytes");
yield return _resourceManager.PrepareAssetAsync("face_landmark_with_attention.bytes");
var stopwatch = new Stopwatch();
_graph = new CalculatorGraph(_configAsset.text);
var outputVideoStream = new OutputStream<ImageFramePacket, ImageFrame>(_graph, "output_video");
outputVideoStream.StartPolling().AssertOk();
_graph.StartRun().AssertOk();
stopwatch.Start();
while (true)
{
_inputTexture.SetPixels32(_webCamTexture.GetPixels32(_inputPixelData));
var imageFrame = new ImageFrame(ImageFormat.Types.Format.Srgba, _width, _height, _width * 4, _inputTexture.GetRawTextureData<byte>());
var currentTimestamp = stopwatch.ElapsedTicks / (System.TimeSpan.TicksPerMillisecond / 1000);
_graph.AddPacketToInputStream("input_video", new ImageFramePacket(imageFrame, new Timestamp(currentTimestamp))).AssertOk();
yield return new WaitForEndOfFrame();
if (outputVideoStream.TryGetNext(out var outputVideo))
{
if (outputVideo.TryReadPixelData(_outputPixelData))
{
_outputTexture.SetPixels32(_outputPixelData);
_outputTexture.Apply();
}
}
}
}
private void OnDestroy()
{
if (_webCamTexture != null)
{
_webCamTexture.Stop();
}
if (_graph != null)
{
try
{
_graph.CloseInputStream("input_video").AssertOk();
_graph.WaitUntilDone().AssertOk();
}
finally
{
_graph.Dispose();
}
}
}
}
}
Let's try running!

Hmm, it seems to be working, but the top and bottom appear to be reversed.
In Unity, the pixel data is stored from bottom-left to top-right, whereas MediaPipe assumes the pixel data is stored from top-left to bottom-right.
Therefore, if you send the pixel data to MediaPipe as is, MediaPipe will receive an upside-down image.
🔔 ImageFrame#TryReadPixelData automatically reads pixels upside down, so the output image is received correctly.
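To make the row-order difference concrete, a manual vertical flip of a row-major pixel buffer could look like the sketch below. `PixelRows.FlipVertically` is a hypothetical helper written for illustration, not part of the plugin; it assumes one array element per pixel, as with the `Color32[]` buffers used in this tutorial.

```csharp
using System;

public static class PixelRows
{
    // Returns a copy of a row-major pixel buffer with the row order reversed,
    // so a bottom-up image becomes top-down (and vice versa).
    public static T[] FlipVertically<T>(T[] pixels, int width, int height)
    {
        if (pixels == null)
        {
            throw new ArgumentNullException(nameof(pixels));
        }
        if (width <= 0 || height <= 0 || pixels.Length != width * height)
        {
            throw new ArgumentException("Buffer length must equal width * height.");
        }

        var flipped = new T[pixels.Length];
        for (var y = 0; y < height; y++)
        {
            // Row y of the source becomes row (height - 1 - y) of the result.
            Array.Copy(pixels, y * width, flipped, (height - 1 - y) * width, width);
        }
        return flipped;
    }
}
```

Flipping on the CPU like this costs an extra copy every frame, which is one reason to let the graph do it instead.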
You can flip the input image vertically by yourself, but here we will use ImageTransformationCalculator.
node: {
calculator: "ImageTransformationCalculator"
input_stream: "IMAGE:throttled_input_video"
output_stream: "IMAGE:transformed_input_video"
node_options: {
[type.googleapis.com/mediapipe.ImageTransformationCalculatorOptions] {
flip_vertically: true
}
}
}
Don't forget to replace throttled_input_video with transformed_input_video in the nodes that follow.
# Subgraph that detects faces and corresponding landmarks.
node {
calculator: "FaceLandmarkFrontCpu"
- input_stream: "IMAGE:throttled_input_video"
+ input_stream: "IMAGE:transformed_input_video"
input_side_packet: "NUM_FACES:num_faces"
input_side_packet: "WITH_ATTENTION:with_attention"
output_stream: "LANDMARKS:multi_face_landmarks"
# Subgraph that renders face-landmark annotation onto the input image.
node {
calculator: "FaceRendererCpu"
- input_stream: "IMAGE:throttled_input_video"
+ input_stream: "IMAGE:transformed_input_video"
input_stream: "LANDMARKS:multi_face_landmarks"
input_stream: "NORM_RECTS:face_rects_from_landmarks"
input_stream: "DETECTIONS:face_detections"
This time it should work correctly.
