Saturday, December 12, 2020

WebXR device API

Like the WebVR spec before it, the WebXR Device API is a product of the Immersive Web Community Group, which has contributors from Google, Microsoft, Mozilla, and others. The 'X' in XR is intended as a sort of algebraic variable that stands for anything in the spectrum of immersive experiences. The API is available in the previously mentioned origin trial as well as through a polyfill.


The WebXR Device API does not provide image rendering features. That's up to you. Drawing is done using WebGL APIs. 

Three.js has supported WebXR since May. I haven't heard anything about A-Frame support yet.


Starting and running an app

The basic process is this:


Request an XR device.

If it's available, request an XR session. If you want the user to put their phone in a headset, it's called an immersive session and requires a user gesture to enter.

Use the session to run a render loop which provides 60 image frames per second. Draw appropriate content to the screen in each frame.

Run the render loop until the user decides to exit.

End the XR session.

Let's look at this in a little more detail and include some code. You won't be able to run an app from what I'm about to show you. But again, this is just to give a sense of it.






Request an XR device

An immersive session is one that requires a headset; it's what most people think of when they hear 'virtual reality' or 'augmented reality'. A non-immersive session simply shows content on the device screen, and is sometimes called a 'magic window'. If you're not using an immersive session, you can skip advertising the functionality and getting a user gesture, and go straight to requesting a session.





if (navigator.xr) {
  navigator.xr.requestDevice()
  .then(xrDevice => {
    // Advertise the AR/VR functionality to get a user gesture.
  })
  .catch(err => {
    if (err.name === 'NotFoundError') {
      // No XRDevices available.
      console.error('No XR devices available:', err);
    } else {
      // An error occurred while requesting an XRDevice.
      console.error('Requesting XR device failed:', err);
    }
  });
} else {
  console.log('This browser does not support the WebXR API.');
}


Request an XR session

Now that we have our device and our user gesture, it's time to get a session. To create a session, the browser needs a canvas on which to draw.





xrPresentationContext = htmlCanvasElement.getContext('xrpresent');
let sessionOptions = {
  // The immersive option is optional for non-immersive sessions;
  //   the value defaults to false.
  immersive: false,
  outputContext: xrPresentationContext
};
xrDevice.requestSession(sessionOptions)
.then(xrSession => {
  // Use a WebGL context as a base layer.
  xrSession.baseLayer = new XRWebGLLayer(xrSession, gl);
  // Start the render loop.
});


Run the render loop

The code for this step takes a bit of untangling, so I'm about to throw a bunch of words at you. If you want a peek at the final code, jump ahead for a quick look, then come back for the full explanation. There's quite a bit that you may not be able to infer from the code alone.


The basic process for a render loop is this:


Request an animation frame.

Query for the position of the device.

Draw content based on the device's position.

Do work needed for the input devices.

Repeat 60 times a second until the user decides to quit.
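
The repeating part of this process is a self-scheduling loop: each frame callback does its work, then immediately requests the next frame. The sketch below is hypothetical (the `makeLoop` helper and fake scheduler are mine, not part of the API) and injects the scheduler so it runs outside a browser; in WebXR the scheduler would be XRSession.requestAnimationFrame().

```javascript
// Hypothetical sketch of a self-scheduling render loop. The scheduler is
// injected so this runs outside a browser.
function makeLoop(schedule, onFrame) {
  function tick(time) {
    onFrame(time);   // draw this frame
    schedule(tick);  // immediately request the next frame
  }
  return tick;
}

// Drive it with a fake scheduler that just records the pending callback.
let frames = 0;
let pending = null;
const tick = makeLoop(fn => { pending = fn; }, () => { frames += 1; });
tick(0);                   // frame 1
pending(16); pending(33);  // frames 2 and 3
// frames is now 3
```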





Request a presentation frame

The word 'frame' has several meanings in a Web XR context. The first is the frame of reference which defines where the origin of the coordinate system is calculated from, and what happens to that origin when the device moves. (Does the view stay the same when the user moves or does it shift as it would in real life?)


The second type of frame is the presentation frame, represented by an XRFrame object. This object contains the information needed to render a single frame of an AR/VR scene to the device. The naming is a bit confusing because a presentation frame is retrieved by passing a callback to XRSession.requestAnimationFrame(), which makes that method signature-compatible with window.requestAnimationFrame().


xrSession.requestFrameOfReference('eye-level')
.then(xrFrameOfRef => {
  xrSession.requestAnimationFrame(function onFrame(time, xrFrame) {
    // The time argument is for future use and not implemented at this time.
    // Process the frame.
    xrFrame.session.requestAnimationFrame(onFrame);
  });
});



Poses

Before drawing anything to the screen, you need to know where the display device is pointing and you need access to the screen. In general, the position and orientation of a thing in AR/VR is called a pose. Both viewers and input devices have a pose. (I cover input devices later.) Viewer and input-device poses are each defined as a 4-by-4 matrix stored in a Float32Array in column-major order. You get the viewer's pose by calling XRFrame.getDevicePose() on the current animation frame object. Always test whether you got a pose back; if something went wrong, you don't want to draw to the screen.


let pose = xrFrame.getDevicePose(xrFrameOfRef);
if (pose) {
  // Draw something to the screen.
}
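
Column-major order means the translation component of a pose matrix lives in elements 12 through 14. A minimal sketch, using a hypothetical getTranslation() helper that is not part of the API, shows how to read it from a plain Float32Array:

```javascript
// Hypothetical helper: read the translation out of a column-major 4x4 pose.
function getTranslation(matrix) {
  return { x: matrix[12], y: matrix[13], z: matrix[14] };
}

// Identity orientation, translated 2 meters along -z.
const poseMatrix = new Float32Array([
  1, 0, 0, 0,   // column 0: x axis
  0, 1, 0, 0,   // column 1: y axis
  0, 0, 1, 0,   // column 2: z axis
  0, 0, -2, 1,  // column 3: translation
]);

const t = getTranslation(poseMatrix);
// t.z is -2
```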


Views


After checking the pose, it's time to draw something. The object you draw to is called a view (XRView). This is where the session type becomes important. Views are retrieved from the XRFrame object as an array. If you're in a non-immersive session the array has one view. If you're in an immersive session, the array has two, one for each eye.



for (let view of xrFrame.views) {
  // Draw something to the screen.
}
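
In an immersive session the two views are typically rendered side by side into one framebuffer. The real API hands you the rectangle to draw into (an XRViewport) via the base layer; the viewportFor() helper below is a hypothetical sketch of the underlying math only:

```javascript
// Hypothetical sketch of a side-by-side eye split. The WebXR layer computes
// this for you; this just shows the idea.
function viewportFor(eyeIndex, width, height) {
  return {
    x: eyeIndex * (width / 2),  // right eye starts at the horizontal midpoint
    y: 0,
    width: width / 2,           // each eye gets half the framebuffer
    height: height,
  };
}

const left = viewportFor(0, 1920, 1080);   // left.x is 0
const right = viewportFor(1, 1920, 1080);  // right.x is 960
```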



Here's how it looks altogether:


xrSession.requestFrameOfReference('eye-level')
.then(xrFrameOfRef => {
  xrSession.requestAnimationFrame(function onFrame(time, xrFrame) {
    // The time argument is for future use and not implemented at this time.
    let pose = xrFrame.getDevicePose(xrFrameOfRef);
    if (pose) {
      for (let view of xrFrame.views) {
        // Draw something to the screen.
      }
    }
    // Input device code will go here.
    xrFrame.session.requestAnimationFrame(onFrame);
  });
});


Ending the XR session

A session can end because your app called XRSession.end() or because the user exited. Either way, listen for the session's end event so the page can clean up after itself.


xrDevice.requestSession(sessionOptions)
.then(xrSession => {
  // Create a WebGL layer and initialize the render loop.
  xrSession.addEventListener('end', onSessionEnd);
});


// Restore the page to normal after immersive access has been released.

function onSessionEnd() {

  xrSession = null;


  // Ending the session stops executing callbacks passed to the XRSession's

  // requestAnimationFrame(). To continue rendering, use the window's

  // requestAnimationFrame() function.

  window.requestAnimationFrame(onDrawFrame);

}


How does interaction work?



The WebXR Device API adopts a "point and click" approach to user input. With this approach every input source has a defined pointer ray to indicate where an input device is pointing and events to indicate when something was selected. Your app draws the pointer ray and shows where it's pointed. When the user clicks the input device, events are fired—select, selectStart, and selectEnd, specifically. Your app determines what was clicked and responds appropriately.


To users, the pointer ray is just a faint line between the controller and whatever they're pointing at. But your app has to draw it. That means getting the pose of the input device and drawing a line from its location to an object in AR/VR space. That process looks roughly like this:


let inputSources = xrSession.getInputSources();
for (let inputSource of inputSources) {
  let inputPose = xrFrame.getInputPose(inputSource, xrFrameOfRef);
  if (!inputPose) {
    continue;
  }
  if (inputPose.gripMatrix) {
    // Render a virtual version of the input device
    //   at the correct position and orientation.
  }
  if (inputPose.pointerMatrix) {
    // Draw a ray from the gripMatrix to the pointerMatrix.
  }
}
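
The pointerMatrix is itself a column-major 4-by-4 transform: the ray's origin is the matrix's translation, and its direction is the negated z column. The pointerRay() helper below is a hypothetical sketch of how you might extract both before drawing the line:

```javascript
// Hypothetical helper: derive a pointer ray from a column-major 4x4 matrix.
// The ray starts at the matrix's translation and points down its -z axis.
function pointerRay(m) {
  return {
    origin: [m[12], m[13], m[14]],       // column 3: translation
    direction: [-m[8], -m[9], -m[10]],   // negated column 2: -z axis
  };
}

// An identity matrix yields a ray from the origin pointing down -z.
const ray = pointerRay([
  1, 0, 0, 0,
  0, 1, 0, 0,
  0, 0, 1, 0,
  0, 0, 0, 1,
]);
// ray.origin is [0, 0, 0]; ray.direction[2] is -1
```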


References:

https://developers.google.com/web/updates/2018/05/welcome-to-immersive
