Thanks to everyone who submitted a request or provided feedback. It helped motivate us to push this out sooner and helped guide us on what to include in this initial release.
This was built with the Hootsy Unity SDK. The main objective was to let you easily view your assistants in AR. If you built an assistant using one of the prebuilt characters, then after downloading the app, all you have to do is click the AR link in your scene; it will automatically open the app and add that assistant to your list.
Some other features include:
Main intro assistant to guide people when they first open the app.
Ability to interact with up to 2 assistants at once. Assistants can either be placed in your world or attached to the screen by tapping the pin icon.
If you run into any issues or have any feedback, please let us know – firstname.lastname@example.org. We’re continuing to improve performance as well as add more assistants with real-world use cases before launching it in the App Store.
Hootsy currently supports built-in integration with Lex and Dialogflow, as well as custom webhook integration. The custom webhook integration allows you to connect any chatbot service’s messaging API to Hootsy.
Hootsy’s messaging structure closely matches Facebook’s, so a custom integration needs to send Hootsy the same messaging structure it would send to Facebook. For more info on our messaging structure, see the developer docs.
In this example, we’ll convert the Dialogflow v1 API response to Facebook’s format and then send it to Hootsy. See the sample code on Glitch; feel free to remix it. Note: this example assumes you’ve already created a chatbot in Dialogflow.
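The core of the conversion is a small mapping function. Here is a minimal sketch of what the Glitch sample does, assuming the standard Dialogflow v1 fulfillment message types (type 0 for text, type 2 for quick replies) and Facebook’s Send API message shape; the recipient field and error handling are simplified:

```javascript
// Convert a Dialogflow v1 result into a Facebook-style message
// payload that can be forwarded to Hootsy's send endpoint.
function toFacebookMessage(dialogflowResult, recipientId) {
  const messages =
    (dialogflowResult.fulfillment && dialogflowResult.fulfillment.messages) || [];
  const first = messages[0] || {};
  let message;
  if (first.type === 2) {
    // Dialogflow v1 quick replies (type 2) map to Facebook's
    // quick_replies array on a text message.
    message = {
      text: first.title,
      quick_replies: (first.replies || []).map((reply) => ({
        content_type: 'text',
        title: reply,
        payload: reply,
      })),
    };
  } else {
    // Plain text response (type 0).
    message = { text: first.speech || '' };
  }
  return {
    recipient: { id: recipientId },
    message,
  };
}
```

The resulting object is what you would POST to the H_SEND_URL defined below, along with your Hootsy access token.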
First, we’ll create the assistant in Hootsy. Expand the Conversation section in the left menu. Set source to Custom, define the webhook URL (in our case: https://hootsy-custom-webhook-dialogflow.glitch.me/webhook/), and generate an Access token and a Verify token.
In Dialogflow, grab the V1 client access token.
You now have everything you need to connect Dialogflow and Hootsy.
In the sample code, open the .env file.
Set the following:
H_ACCESS_TOKEN and H_VERIFY_TOKEN to the values in Hootsy.
DIALOGFLOW_V1_TOKEN to the value in Dialogflow.
H_SEND_URL to ‘https://ws.hootsy.com/api/send’.
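Putting those together, the `.env` file should look something like this (the token values shown are placeholders for your own):

```
H_ACCESS_TOKEN=your-hootsy-access-token
H_VERIFY_TOKEN=your-hootsy-verify-token
DIALOGFLOW_V1_TOKEN=your-dialogflow-client-access-token
H_SEND_URL=https://ws.hootsy.com/api/send
```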
That’s it! You should now be able to go back to the assistant in Hootsy and interact with it.
If you run into any issues, let us know – email@example.com.
This is a step-by-step walkthrough with the goal of providing enough detail that you can build this yourself, and maybe even improve upon it.
Why build this?
I’ve always had a love and fascination for Wikipedia. Many of the articles include corresponding location data. I thought it would be cool to be able to walk around and hold up your phone to something and have someone there to tell you more about it – even cooler when AR glasses are available and I don’t have to hold up my phone. This might encourage people to learn more about the world around them, and keep them more in the moment by not having to look down at their phones to get that information.
It would also be nice if I could look at a 3D model from the comfort of my home and get related info about it from Wikipedia.
Essentially, this is envisioning what Wikipedia could look like in augmented reality.
Also, I’m going to use an article about some artwork, because I’m interested in the experience of placing art on my wall and getting information about it. Artsy and Saatchi Art have apps that allow me to place art on my wall, but I can’t get information about the art while I’m looking at it in AR. Maybe this could even help artists sell more art, which would be nice.
Tools and Tech
Here is what I plan to use to accomplish this:
Dialogflow – used to build out the conversation.
Wikipedia – for the content of the conversation.
Blender – to build the 3D art model.
Hootsy – to create the voice interactive assistant.
Unity – to deploy the experience to an iOS and Android app for the AR experience.
If you’re new to building conversations (aka chatbots), it may help to read through some of the Dialogflow documentation or YouTube videos, but I’ll try to walk through it here. Select the Default Welcome Intent. This is the response to someone saying hello or some variation of it. Change the Response of this intent: delete what is there and add something new. Here is mine:
Save and return to the list of intents. If you hover over the welcome intent, you’ll see an option to add a follow-up intent. Select that and then ‘yes’. Open up that intent and scroll to the bottom again to change the response. This time, I’m going to create a response that displays buttons. Select the ‘+’ next to the default tab and then select Facebook Messenger. Hootsy is set up to handle a message structure similar to Facebook’s, so this will work fine. Select Add response and then Quick Replies. Define the text field and buttons. The benefit of displaying buttons is that it helps guide the conversation so the user better knows what they can ask. In this case, I want the assistant to indicate what it knows, and to create buttons that map directly to sections of the Wikipedia article. Here is the Wikipedia article I’ll be using, https://en.wikipedia.org/wiki/Drowning_Girl, and here is the Dialogflow response.
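For reference, the Facebook-format message a Quick Replies response like this produces looks roughly like the following. The text and button titles are my example choices mapped to sections of the article; the exact JSON Dialogflow emits may differ slightly:

```json
{
  "text": "I can tell you about this painting. What would you like to know?",
  "quick_replies": [
    { "content_type": "text", "title": "Summary", "payload": "Summary" },
    { "content_type": "text", "title": "History", "payload": "History" },
    { "content_type": "text", "title": "Reception", "payload": "Reception" }
  ]
}
```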
You may also want to create a follow-up intent for the user saying ‘no’ when asked if they want to know more about the art.
Next, we’re going to create intents that correspond to each of these quick replies. I’ll show the process for one.
Create a new intent and call it Summary. Define the training phrases, which are the things a user could say that would trigger the assistant to respond with the Wikipedia summary. Dialogflow handles the Natural Language Processing (NLP), so it determines whether something the user says closely matches one of these training phrases. Dialogflow also provides analytics to help you understand how people are interacting and which training phrases you should add or update to improve the experience. For now, we’ll start with the following:
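As an illustration, training phrases for a Summary intent might include something like the set below. These are example phrasings of my own, not necessarily the exact list:

```
Summary
Tell me about this painting
What is this artwork?
Give me an overview
What's the story behind this?
```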
I generally try to keep the responses very short, but since this is intended to be more educational, I’ll experiment with providing the entire first paragraph of the summary followed by a question, ‘Do you want to know more?’.
If they say yes, then the assistant responds with the second paragraph of text. I can do this by adding a follow-up intent similar to what we did with the welcome intent.
Continue this process for all quick replies.
To make the assistant feel smarter, I’m going to do a few more things. First, I’m going to turn on Dialogflow’s ‘Small Talk’. Next, I’m going to create a few more intents to handle common questions about the artwork like size, materials, location, etc. I’m restricting myself to only the information available on Wikipedia.
You can test out the conversation on the right-hand side of the Dialogflow site before integrating with Hootsy.
Creating the 3D Model
These are the steps to create the 3D artwork. I’m going to use Blender v2.8 (blender.org) since it’s free and I like the v2.8 interface as much as any of the paid 3D modeling programs.
For those new to Blender or 3D modeling in general, the tools can be challenging to get used to so here is the final model if you want to skip this section. I do recommend trying it out though as it’s a lot of fun once you get used to it.
I’m going to adjust the default cube to be 1m x 1m x 0.03m. Wikipedia provides the artwork dimensions and this size will allow me to easily scale height and width to match any artwork.
Next, I’m going to select the edges of the square facing us. With the edges selected, hit Ctrl+E and select Mark Seam in the menu that pops up. Next, hit ‘U’ to bring up the menu and select Unwrap. Here is a view of the unwrapped UV.
For some reason, the texture ends up rotated 180 degrees when it’s added in a later step so we’ll fix that now. In the UV Editor view (bottom window in image above), hit the ‘a’ key to select all and then ‘r’ key to rotate. Type ‘180’ to rotate 180 degrees.
Now we’re going to add the materials and textures. First I’m going to change the base color of the cube to be black.
Then I’m going to create a new material, select the face of the cube that we marked seams for and then hit Assign (make sure you’re in Edit Mode if you don’t see the assign button).
Switch to Shader Editor as seen in the image below. Download the artwork image from the wikipedia article https://en.wikipedia.org/wiki/File:Roy_Lichtenstein_Drowning_Girl.jpg. Drop it into Shader Editor window and connect the color of that node to the Base Color of the Principled BSDF node. You should now see the image displayed on the model. If not, ensure you’re in the LookDev view to view texture.
We are now ready to export. We’re going to export in two formats: glTF and FBX. glTF is my preferred format for the web, and FBX imports best in Unity (although .blend files sometimes work too). Click File in the top menu, then Export, then glTF 2.0. You can test the result by dragging the exported files here: https://gltf-viewer.donmccurdy.com/. One thing you’ll notice is that the image quality of the artwork is less than ideal. We would likely want to improve this with higher-quality images.
For exporting to FBX, set Apply Scale to FBX All and select Mesh to only export the mesh.
Creating the Voice Interactive Assistant
This is where we pull everything together. In Hootsy (https://hootsy.com), create a new assistant. I’m going to use one of the pre-built characters for simplicity, but you could build your own using Blender or another tool if you wanted. I’m also going to change the appearance for fun so it looks more like the girl in the artwork.
Next, I’m going to expand Conversation and change source to Dialogflow. I’m going to open up Dialogflow again, go to settings and then switch API version to v1 and copy the client access token (note: v1 will be deprecated later this year, but we plan to update to v2 before then). Go back to Hootsy, and paste this token into the Dialogflow access token field. You should now be able to interact with the assistant using that conversation.
Click add button next to Conversation Models and drag the recently exported Blender model. Select Display on Scene Load and then adjust the position/rotation as desired.
If you want to share this with others, select the Share button at the top of the left menu and copy the URL. Note that the Safari in-app browser does not currently provide mic access (see bug), so we display a note directing people to view it in Safari proper in those cases. Here is the experience I created: Drowning Girl Scene.
When you say ‘hello’, you’ll notice she waves to you. This was triggered through a custom message in Dialogflow. In Hootsy, you can see an animation defined for the assistant called ‘Waving’, which is triggered on request. In Dialogflow, I can trigger this through a custom payload as part of the welcome response. You can find more info on custom payloads in the Hootsy dev docs.
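As a rough illustration, a custom payload naming an animation might look something like the snippet below. The field names here are hypothetical, so check the Hootsy dev docs for the actual schema:

```json
{
  "hootsy": {
    "animation": "Waving"
  }
}
```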
This web experience was built with Three.js. If you want to further customize this experience or deploy it within another Three.js experience, let us know. We plan to release the web SDK soon.
Deploying to Native App for AR
Quality AR experiences are not possible today on the web. There is progress towards making this possible, which you can learn about here, but for the near term, we’ll need to deploy to a native app to take full advantage of the AR capabilities with ARKit (for iOS) and ARCore (for Android).
For this we’ll use Unity (a popular game engine) and the Hootsy Unity SDK. Please refer to the SDK readme for initial setup. After initial setup, it should look like this.
Create a new folder under Assets and call it ‘Resources’. Create a new folder under Resources and call it ‘Art’. Drag the FBX model and image into the Art folder. Select the model, and Extract Materials.
Select Material.002 and change Rendering Mode to Opaque. We will be focused on the AR scene, but we’ll open the VR example scene to more easily set up the art model. In the VR scene, create an empty object and give it the same name as the GLTF model, which is likely the same name you used for the FBX model. Now drag the FBX model so it’s a child of the new object. For the child object, set Y rotation to 90, scale to 1.7, and cast shadows to off. Drag the parent object into the Art folder to create a prefab.
Open the AR example scene, select AR Session Origin and change Detection flags to Everything. Make sure you defined the Access Token and Assistant ID as detailed in the SDK readme. You should now be ready to build.
Go to File -> Build Settings. I’m going to build for iOS at this point, but you could also build for Android as we used Unity’s AR Foundation for multi-platform support.
Select iOS, select Switch Platform, and then select Build. Open the project in Xcode, define your Team, connect your iPhone or iPad, and then run. When you run the AR scene, look around the floor until you find a ground plane, then tap on it to place the assistant. Then find a plane on the wall and tap it to place the art. You could improve the experience of placing the assistant and artwork by using some of the assets in Unity’s AR Foundation UX example. You could also consider attaching the assistant to the screen so the only object that needs to be placed is the artwork.
Here are some of the ways this could be improved further:
Scale to all articles: Extract Wikipedia data using the MediaWiki API. Example query. You could add a fulfillment to your conversation to retrieve this information before responding to a user. As part of this you could:
Group content so a single assistant can cover multiple articles. For example, the assistant can show and describe all works by Roy Lichtenstein.
Location aware: Request user’s location and use it to retrieve relevant content.
Image Recognition: Allow user to take an image of something and use image recognition to classify and find relevant content.
Improve Artwork Placement: Currently using vertical plane detection in AR, but if the wall is empty it’s sometimes difficult to find a plane. The touch control script could also be updated to allow movement along a vertical plane after placement.
Shorter, better responses: Determine better ways to extract relevant content from Wikipedia articles and provide responses to users.
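To give a concrete sense of the MediaWiki extraction idea above, here is a small sketch that builds a query URL for an article’s plain-text intro and pulls the extract out of the response. The parameter names follow the public MediaWiki action API (TextExtracts); pagination and error handling are omitted:

```javascript
// Build a MediaWiki API query URL that returns the plain-text
// intro extract for a given article title.
function buildExtractUrl(title) {
  const params = new URLSearchParams({
    action: 'query',
    prop: 'extracts',
    exintro: '1',      // only the section before the first heading
    explaintext: '1',  // strip HTML from the extract
    format: 'json',
    titles: title,
  });
  return 'https://en.wikipedia.org/w/api.php?' + params.toString();
}

// Pull the extract text out of the API's response shape:
// { query: { pages: { <pageid>: { extract: "..." } } } }
function firstExtract(apiResponse) {
  const pages = (apiResponse.query && apiResponse.query.pages) || {};
  const first = Object.values(pages)[0];
  return first ? first.extract : undefined;
}
```

A fulfillment could call `buildExtractUrl('Drowning Girl')`, fetch the URL, and respond to the user with the first paragraph of the returned extract.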
If you made it this far, you rock! I’d love to know what you think. Please leave a comment or share with anyone who might find this interesting.
We’ve built the best voice interactive experience on the web. This has been used primarily for improving virtual and augmented reality experiences. However, many people have asked us, ‘Can we add this voice interaction to our website?’.
People may visit your website and struggle to find the information and support they need. It’s difficult to predict every question a person might ask and it’s difficult to know where to place that information on your website so people can easily find it.
If it’s a restaurant or local business, the best-case scenario is that the visitor calls the business, although this takes up employee time that could be focused elsewhere. More often, though, the potential customer simply moves on and looks somewhere else.
If it’s an online business or startup, it’s even more likely that they will simply move on and look somewhere else.
There are companies like Intercom that make it easier to add more personalized support to websites. However, Intercom doesn’t provide voice interactive capabilities and the speed and ease that those capabilities provide. Also, for those who are more familiar with what we’re doing in virtual and augmented reality, these same experiences will be transferable to these other mediums. This will help future-proof your company as VR and AR become more ubiquitous.
Just like Amazon’s Alexa makes it easy to play music or ask for the news, it should be just as quick and easy for website visitors to get the information and support they need.
A small button is fixed to the lower right corner of your website. The color of the icon will be customizable, and there will be a fallback to text-based interactions as needed. Here is an example:
When people tap on the button, they can interact with a virtual assistant and ask for whatever they need. If the assistant can’t answer the question, we’ll direct them to ways to contact you, either via live chat with an employee/agent or other means. Over time, you’ll gain insight into what your customers want to know and the assistant will get smarter and better at assisting your customers.
If you’re interested in adding this to your website, you can submit a request for a free website evaluation here.
To stay in the present moment, we need to be looking up instead of down at a screen. We’ll use this image as a starting point.
Like many things in life, it’s beautiful. We don’t need to augment it with images or 3D models to make it more impressive. However, let’s say you want to know something about this place. Maybe it’s busy inside or it’s not open yet and you want to see what kind of baking classes they offer. Today, you would look down at your phone and open an app like Yelp, Google Maps, or try to find the business’s website or social media profile. This pulls you away from the moment, so how can we prevent this?
Augmented Reality (AR) allows us to stay looking up, especially when AR glasses are released in the near future. I don’t want to view a 2D screen overlaid in the real world as that would be no better than what I could do with my phone. I want to talk to someone or something about what I need, so let’s imagine something that looks more human and that I can talk to and get the info I want. We’ll call this the assistant. It should blend in with the scene, but also be noticeable if I’m looking for it. Here is a female assistant waving to us.
The assistant should only start listening when I look at her. I should also be able to easily control when she is listening. For this, we’ll need to add a mic button, but it only needs to display when we’re interacting with the assistant. For now, it’ll be attached to the screen (bottom-middle), but in the future we may be able to use some control on the device or other approach that doesn’t require a separate button in our view.
That’s it. That should be all we need. I should be able to walk up, ask the assistant for info about the classes, and then move on, never having to look down.
There are some additional use cases that we would want to cover, but which should hold to the same requirement of limiting how much displays in our view. One use case is that we want to get info from the assistant, but we don’t know exactly what she knows and we’re not exactly sure how to ask. You’ve probably experienced this frustration when asking Alexa or Google Home for something. For this, we can display buttons next to the assistant (in 3D space) to help guide the conversation.
There are also cases where I might want to talk to this assistant as I walk away, or I might prefer not having any 3D models displayed in my world, or I want an assistant always available to help regardless of where I am. For this, we will attach the assistant to the screen so it always displays in our view (bottom-left) and have it be easily removed from view when desired.
Finally, as a developer or business, I want to control the type of experience that I provide to potential customers so that it’s unique, memorable, and aligns with my brand. It would be quite boring to end up in a world where we only interact with a Google, Amazon, or Apple assistant, or for these companies to control the type of interactions we can create and try to push us towards their walled-gardens. For this reason, we’re putting a lot of focus on allowing businesses and developers to fully control the appearance and interactions for these assistants, and to have them work across all platforms, native and web.
When I was a child, I remember a summer day seeming to last forever. I could do and see so much in one day. It was beautiful. Now as I grow older, time seems to be speeding up. This becomes more apparent after having a child. Today she is walking and talking and yet it feels like yesterday she was born. I fear if I blink too hard she’ll be graduating high school and headed off to college.
It forces me to look carefully at my time. The idea of living until I’m 90 felt like an eternity when I was 12. Now that I’m suddenly in my 30s, it doesn’t seem so far away.
So I ask myself questions that I believe we should all ask. Am I fully present in the moment? Am I creating things that make the world a better place?
To be honest, I am very rarely present in the moment. I am addicted to my phone, which is only easy to say because I know you, the reader, likely are as well. The level of addiction wasn’t as obvious until I ran some experiments. One was to place the phone on a table and just stare at it for 5 minutes without picking it up. I failed many times with many excuses like ‘this is a stupid experiment’ and ‘this doesn’t count, because I just need to check this one thing’.
At times when I’m with my child and should be taking in the moment, I find myself, almost without thought, unlocking my phone to check on something. I’ll stay focused on the screen until she pulls on me, seeming to say, ‘Stop, please!’
Cigarettes are obviously bad. There is study after study showing their direct link to cancer and other health issues. Even then, it took many years for the facts to change people’s habits. Technology in itself isn’t bad, but the apps on it and what they take from us often are. They ask for our time and very rarely give anything positive in return.
As someone who builds web and mobile apps for a living, I am acutely aware of all the techniques that apps use to absorb our time. The ways in which they hook us with little dopamine hits that keep us coming back over and over and over again. Despite knowing this, I still struggle to change my habits. I’m now convinced that as a society there is no going back. The time before cell phones and social media is a time of the past. It’s only going to permeate society more.
It’s already at the point where we feel left behind if we don’t know how to use it. Recently, a family member called crying because she couldn’t figure out how to text her friends and felt she was losing them as a result. While you may be more technically savvy than this, you can probably relate to that feeling of anxiety with having to figure out some new device or app.
Since we can’t escape technology, I believe the only answer is to dive in deeper, but do so conscientiously. Essentially, we need to make technology invisible. It’s there when we need it, but it’s not constantly trying to pull us away and we don’t have to figure out how to use it.
Now is a particularly interesting time as augmented reality starts to grow in popularity. We’ll soon have AR glasses that create a more intimate relationship with technology. It could be a beautiful thing where AR truly enhances our reality by serving our needs quicker and easier than today’s cell phones…. or it could be a terrible thing that serves as a greater distraction and time sink.
I believe we can make it beautiful, and that is what drives me now. It is why our mission at Hootsy is to make interacting with technology feel as natural as talking to another human. We want to make the interface fade away.
Technology is there to serve us and while it might get more complex over time, how we interact with it should get easier. We should be able to simply say what we need, just like if we were talking to another human, and then quickly move on with our lives – never having to look down or feel something trying to pull us away from the current moment.