When will smartphones be intelligent?
Many tech businesses these days are investing heavily in developing artificial intelligence, and with good reason. Artificial intelligence can potentially evolve much more quickly than human intelligence and therefore find solutions to problems much more efficiently. We’re not quite there yet, though. In fact, it may be a few decades before we get any artificially intelligent constructs that can match the general intelligence of humans… well, some humans. Right now we’ve got some basic constructs performing narrow intelligence tasks, which is to say they can solve very specific problems.
For most of the age of computers, calculations were solved using rules that were pre-programmed into the computer. We started with basic math, then moved up from there. Even today smartphone graphic user interfaces are still hand coded around rules. “If user taps this button, then do this.” That works okay, but it’s not really intelligent.
Smartphones are kind of dumb
In fact, many apps and mobile operating systems are hard coded in a way that isn’t smart or intelligent. All smartphone operating systems have a top-edge-screen gesture as a primary interaction method, but this really makes no sense at all on a phone screen larger than 3.8″. There are lots of ways that smartphone UI designers are failing at making smart interaction methods on phones. In many cases they’re building on concepts that may have made sense in the past, but often no longer do… or only make sense for a narrow portion of the population.
We’ve started to see some semblance of “Narrow Intelligence” coming to our devices though. Narrow intelligence only looks at one specific aspect of a problem and attempts to solve it using a series of algorithms and the data it has access to. The more data the narrow AI is given, the more accurate its answer will be. Siri, Cortana, Alexa, Google Assistant, Bixby, etc. are all examples of narrow intelligence. They have a specific skillset that involves voice recognition and narrow responses. None of those assistants are really able to grow or significantly learn from you as a user just yet. Though I’m sure they are collecting data. There are some minor learning aspects though, such as learning your route to work and being able to warn of traffic problems ahead of time… or learning what kind of news you like… or reading your email and learning to notify you about package delivery information, etc.
Speech based intelligent user interfaces are probably the most exciting for future interactive methods since these can be used without hand & eye movement. At least they could if they were designed properly. Currently, many of these speech UI assistants require you to look at the screen at many points and that’s a huge broken aspect of the experience. These speech based user interfaces still don’t communicate notifications to the user when relevant without vague unintelligible sound effects. I can ask Siri to read my email on demand, but I can’t ask her to read new emails from Anton Nagy out loud to me as soon as one is received… which would be way more useful. There isn’t even an option to program that kind of rule into Siri (or any of the speech UI assistants) myself. Although there used to be a way to program an SMS notification for emails from specific contacts in a T-Mobile push email service, and today there’s a way to do the same thing via Microsoft Flow & Office 365, but that’s not as intelligent as it could be and SMS messages only get automatically read aloud on Windows Phone. Also, why would I want to be notified of an electronic text-based message with another electronic text-based message?
Still, Microsoft Flow and IFTTT cross-service scripting methods are another form of narrow artificial intelligence where you can teach the software to do specific tasks that you want to get done based on specific criteria. The problem is that these rule-creation services don’t integrate very well with the phone’s system software or any of the speech UI virtual assistant systems. I can’t say, “Hey Cortana, let me know when the document I just sent is signed by the client” even though that’s a simple applet/template that I could set up in Microsoft Flow. With Android, however, I can make IFTTT rules that extend the capabilities of the “Ok, Google” speech assistant, and that’s a step in the right direction… but I can’t say something like, “Ok, Google. Until tomorrow morning, reply to all text messages with I’m busy right now.”
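To make that last request concrete, here is a minimal sketch of the kind of time-limited trigger/action rule I’m asking for, written in Python. The Rule class, its fields, and the event dictionaries are all made up for illustration; none of this is an actual IFTTT, Microsoft Flow, or Google Assistant API.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Callable, Optional


@dataclass
class Rule:
    """Hypothetical trigger/action rule with an optional expiry time."""
    trigger: Callable[[dict], bool]   # decides whether an event matches
    action: Callable[[dict], str]     # builds the response for a matching event
    expires_at: Optional[datetime] = None

    def handle(self, event: dict, now: datetime) -> Optional[str]:
        # An expired rule never fires.
        if self.expires_at is not None and now >= self.expires_at:
            return None
        if self.trigger(event):
            return self.action(event)
        return None


# "Until tomorrow morning, reply to all text messages with I'm busy right now."
busy_rule = Rule(
    trigger=lambda e: e.get("type") == "sms",
    action=lambda e: f"To {e['sender']}: I'm busy right now.",
    expires_at=datetime(2017, 10, 13, 8, 0),  # assumed "tomorrow morning"
)

evening = datetime(2017, 10, 12, 21, 30)
next_day = datetime(2017, 10, 13, 9, 0)
print(busy_rule.handle({"type": "sms", "sender": "Anton"}, evening))
print(busy_rule.handle({"type": "sms", "sender": "Anton"}, next_day))
```

The hard part is not the rule itself. It is getting the speech assistant to turn a spoken sentence into a rule like this and hook it into the phone’s messaging system.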
Chatbots are another form of narrow intelligence that live inside various instant messenger or dedicated software programs. These are similar to the speech interface assistants except they respond to basic commands and keywords within an electronic text chat UI. Sometimes they can generate buttons that you can press to answer their questions instead of having to type responses, but in general these are still very limited. What’s worse is that these chatbots each probably have one set of functions that the others do not and you have to specifically choose which one to talk to in order to get the right answers or tasks completed. Finding the proper chatbot agent to talk to often takes up way too much cognitive energy. Really I need one chatbot that understands everything and has access to all of my installed apps/services.
There are a few email-based narrow artificial intelligence systems out there too. [email protected] and [email protected] are two AI systems that will receive an email from you on which you’ve also included a number of other people. Each will then contact every other person included on the original email individually to find a time that works for everyone for a specific meeting or meetup. This is a much smarter system than chatbots in my opinion, since there’s no need to install a specific app. In fact, I believe all of Cortana’s system-wide functions should be integrated into an email address access point (currently the calendar.help address can only process meeting requests). Actually, Cortana was just added to Skype as a chatbot as well, so all she needs now is a telephone, SMS, and email interface.
All of those narrow artificial intelligence constructs still require humans to adapt and learn what specific phrases or commands those systems will correctly respond to. Although, to be fair, the same is often true with human to human interactions.
Intelligent graphical UIs
Many companies started implementing these chatbots after some research surfaced showing that the largest share of app usage happened in instant messenger type chat apps. That may be partly true, but the actual most-used human-computer-interaction method is going to be the graphical user interface. You need to press that button on your home screen to launch that chat app every time, after all. Chatbots and speech interface assistants are fine if you feel like typing stuff or talking to a computer in very specific ways, but a graphical user interface that happens to have the button you need right there, instantly visible and within your reach, is often much faster and more efficient. What are the chances that the app designer made the app in a way that suits your needs precisely, though? I’d say chances are slim, especially if you’re a power user. Smartphone apps are designed in a way that you need to adapt to them instead of allowing them to adapt to the user… and that’s the opposite of an intelligent system.
In the old days of computing, we could do this with manual customization. Many professional grade programs that you’ll find on desktop operating systems offer fully customizable user interfaces. I can create toolbars and palettes and keyboard shortcut combinations that make working with the software much more efficient for the tasks I need to do. I can write scripts that actually add new menu items and functions to some programs. Even professional grade hardware like Wacom’s Mobile Studio Pro offers huge amounts of customization with programmable tactile hardware-based controls. What’s more, every user may have a different need that he/she can make more efficient with some customization and a little human intelligence.
Today the amount of UI customization controls in most smartphones and smartphone apps is very low. In iOS, you can arrange icons into folders and change background images on the launcher, maybe add some widgets to the widget screen or some icons to the control center, but that’s about it. In Android, you can install a completely different app launcher, arrange widgets on the home screen, maybe change some icon designs, but you won’t be able to fix Snapchat’s terrible UI or change the awful colors in the Gmail app. You won’t be able to hide obtrusive in-app buttons that you never use and replace them with functions that you do use. You can’t program your own touch gestures to quickly perform specific functions in specific apps (though you can make some custom system-wide gestures). It’s really not as power-user-friendly as I would hope, and it’s certainly not as personal as it could be.
It didn’t use to be so bad either. In the old Windows Mobile days around the turn of the century, many smartphones had hardware keys that could be programmed to perform certain functions in certain apps. I could customize my device for eyes-free control of the media player and GPS navigation software while driving. That was much safer and more efficient than the touch-screen controls we have today. See “Customizable buttons could usher in smartphone interface nirvana” and “Challenges in Developing User-Adaptive Intelligent User Interfaces (PDF)” by Melanie Hartmann. Back in 2009, Vinnie Brown and Probex made a completely re-designed “Naughty by Nature” version of HTC’s UI on a Touch Pro 2. On the HTC Touch Diamond, I was even able to program which on-screen keyboard was available based on whether the stylus was in its silo (touch-friendly T9 keyboard) or removed from its silo (character recognizer or FITALY). Having a customizable UI allows me to essentially “teach” my device about the way I want it to work, and it will be modified to allow me to do those tasks more efficiently. That’s what intelligence is all about.
Granted, I certainly needed to have some intelligence myself in order to understand what modifications needed to be made in the software/hardware in order to increase efficiency, so that’s where we can improve things further with software that can anticipate the user’s needs by learning directly from the user and notifying the user of a change that could make things easier.
Microsoft actually had an intelligent user interface as part of Office around the turn of the century. “Personalized menus” would monitor which commands you used most often and eventually it would make those more accessible by hiding the menu items that you used least often. This essentially simplified a complex interface in a way that suited the user directly. It also increased efficiency by reducing the amount of mouse travel required to access most-used functions. It may have been ahead of its time though as many users wanted the feature turned off. I guess sometimes it’s important to show the full complexity of the application at all times, which is why auto-personalization intelligence should always be an opt-in feature. That’s also why we need artificial intelligence to learn about the users’ preferences.
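The basic idea behind that kind of menu is simple enough to sketch in a few lines of Python. This is only my own illustration of usage-based hiding, not how Office actually implemented it. The PersonalizedMenu class, the visible_count setting, and the choice to keep the surviving items in their original order are all assumptions.

```python
from collections import Counter


class PersonalizedMenu:
    """Hypothetical menu that hides the least-used commands (opt-in)."""

    def __init__(self, commands, visible_count=3, enabled=False):
        self.commands = list(commands)
        self.visible_count = visible_count
        self.enabled = enabled          # personalization is off unless the user opts in
        self.usage = Counter()

    def record_use(self, command):
        if command not in self.commands:
            raise ValueError(f"unknown command: {command}")
        self.usage[command] += 1

    def visible(self):
        # With personalization off, always show the full menu.
        if not self.enabled:
            return list(self.commands)
        # Rank by usage; sorted() is stable, so ties keep their menu order.
        ranked = sorted(self.commands, key=lambda c: -self.usage[c])
        top = set(ranked[:self.visible_count])
        # Keep the original menu order so items do not jump around.
        return [c for c in self.commands if c in top]

    def hidden(self):
        shown = set(self.visible())
        return [c for c in self.commands if c not in shown]


menu = PersonalizedMenu(
    ["New", "Open", "Save", "Print", "Export", "Close"], enabled=True
)
for command, times in [("Save", 5), ("Open", 3), ("Export", 2), ("Print", 1)]:
    for _ in range(times):
        menu.record_use(command)

print(menu.visible())  # ['Open', 'Save', 'Export']
print(menu.hidden())   # ['New', 'Print', 'Close']
```

The hidden commands would still be reachable through an “expand” control, and leaving the enabled flag off by default is the opt-in behavior argued for above.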
A personalized user interface makes a lot more sense on small-screen devices like smartphones where it’s simply not possible to see all of the functions at the same time due to the limited screen area. Those Bold, Italic, Underline, & Font Color buttons in Microsoft Word’s mobile app are a huge obstruction to what I really need to do which is select the proper Style name. In fact, nobody should ever use those buttons in Word, as linked customizable styles are a far more intelligent and efficient tool for word processing. The delete, previous, next buttons in Outlook Mobile are taking up valuable screen real-estate for functions that I never use! What I really need are “Flag with reminder”, reply, and reply all buttons right there. Maybe other people do use those previous/next buttons… I don’t know and I don’t care. The UI is clearly inefficient for me.
An intelligent user interface should increase my efficiency, not degrade it.
Many mobile app designers design their user interfaces based on whatever they feel like doing. That often means that you get buttons and icons and organizational structures that only the developer understands. If you’re lucky, you might get some apps designed for the average basic user based on some user-testing data, which is a lot better than designing based on no data, but it’s still a lot worse than an intelligent design personalized for each user or use-case scenario. Designing for the average user is going to frustrate and lose both the beginner users and the power users who need something that’s either easier to learn or more efficient to use.
I’m intelligent enough to modify my desktop Outlook email program to decrease cognitive energy usage by automatically color coding emails and appointments based on specific keywords and importance. I’m intelligent enough to program the hardware buttons on my Wacom hardware to decrease muscle energy usage and develop muscle memory for specific tasks. I’m intelligent enough to modify keyboard shortcuts so that I can access frequent commands with only a couple of fingers instead of larger hand movements. Unfortunately, many software programs, especially on mobile platforms, simply aren’t capable of accommodating my intelligence. (And I’m not intelligent enough to program my own software or mobile operating system.)
The way it should work is that all apps would be mandated to have a system-wide consistent customization interface that includes usage data collection. So I would launch a new app and it would present a default average user interface where the buttons have text labels so that I can instantly understand what they do. After a certain number of app launches and some usage data collection, the intelligent GUI should say something like, “I’ve noticed that you’re using specific features more than others. Would you like me to make those features easier to access for you? You can always switch back to the default or manually change the features you use most by tapping Menu > Customize.” The intelligent personalization feature would also allow for collapsing certain buttons into smaller icons so you can fit more functions on the screen if desired.
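Here is a rough Python sketch of that flow: collect usage, wait for a minimum number of launches, offer a new layout, and only apply it if the user accepts. The AdaptiveToolbar class, the min_launches threshold, and the feature names are hypothetical; no mobile OS exposes an API like this today.

```python
from collections import Counter
from typing import List, Optional


class AdaptiveToolbar:
    """Hypothetical toolbar that proposes a layout based on usage data."""

    def __init__(self, default_layout: List[str], min_launches: int = 10):
        self.default_layout = list(default_layout)
        self.layout = list(default_layout)
        self.min_launches = min_launches
        self.launches = 0
        self.usage = Counter()

    def record_launch(self) -> None:
        self.launches += 1

    def record_use(self, feature: str) -> None:
        self.usage[feature] += 1

    def suggestion(self) -> Optional[List[str]]:
        # Stay quiet until there is enough data to justify a prompt.
        if self.launches < self.min_launches:
            return None
        slots = len(self.default_layout)
        proposed = [feature for feature, _ in self.usage.most_common(slots)]
        # Fill any leftover slots with default buttons not already chosen.
        for feature in self.default_layout:
            if len(proposed) >= slots:
                break
            if feature not in proposed:
                proposed.append(feature)
        # Only prompt when the proposal would actually change something.
        return proposed if proposed != self.layout else None

    def accept(self, proposed: List[str]) -> None:
        # Nothing changes unless the user says yes to the prompt.
        self.layout = list(proposed)

    def reset_to_default(self) -> None:
        self.layout = list(self.default_layout)


toolbar = AdaptiveToolbar(["Delete", "Previous", "Next"], min_launches=3)
for feature, times in [("Reply", 4), ("Flag with reminder", 3),
                       ("Reply all", 2), ("Delete", 1)]:
    for _ in range(times):
        toolbar.record_use(feature)

print(toolbar.suggestion())   # None: not enough launches yet
for _ in range(3):
    toolbar.record_launch()
offer = toolbar.suggestion()
print(offer)                  # ['Reply', 'Flag with reminder', 'Reply all']
toolbar.accept(offer)
toolbar.reset_to_default()    # the "switch back to the default" escape hatch
```

Keeping suggestion() separate from accept() is the point of the design: the interface never rearranges itself behind the user’s back, and the default layout is always one call away.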
The problem with customized GUIs in the past has been the inability of tech support to know how a user’s GUI has been customized, but a “switch to default UI” button should easily solve that.
Of course, an intelligent graphical user interface should be completely integrated with the intelligent speech interface and intelligent text chat interface. If all apps were designed to integrate with the phone’s artificial intelligence, I should be able to use the speech interface to change the theme of the entire system. Currently, no apps on iOS or Android follow a system-wide theme structure. They’re all different with different colors and different icon styles… and that makes for a broken inconsistent mess of a user experience. Windows does have a theme interface on desktop and mobile where apps can pick up accent colors, background colors and such, but it’s not nearly as robust as it should be (and even Microsoft itself fails to follow that structure).
UPDATE: Since I started writing this article weeks ago, there has been some speculation that Facebook may be experimenting with an intelligent GUI after all.
Is Facebook’s bottom navigation bar adaptive (based on individual use) and not an ongoing series of A/B tests? https://t.co/0y6C2WMHf7
— Luke Wroblewski (@lukew) October 12, 2017
General and Super Artificial Intelligence
On the other hand, if we’re able to design artificial intelligence that evolves fast enough to reach the general intelligence level, maybe we won’t need to interact with computers anymore at all. Maybe we’ll be on permanent vacations! General Intelligence is the term used to describe an AI construct’s level of intelligence as matching that of a human. I’m not sure which human we’re talking about though since we have extremely variable degrees of intelligence, but the theory is that someday AI will be as good as us humans.
Google thinks their Google Assistant is about as intelligent as a 6-year-old human, but I don’t believe that’s true at all. A 6-year-old can learn from the people it interacts with.
General Artificial Intelligence probably won’t happen for 20-30 more years (if it keeps progressing forward, which is not necessarily true with other computing systems), so I’m pretty sure we’ll still need to interact with and augment narrow artificial intelligence for a while yet. Hence the need for more intelligent graphical, textual and speech user interfaces.
Evolution and natural intelligence used to be the main way that life progressed on Earth. That eventually created human intelligence, which was able to evolve much faster than natural intelligence. We’re able to create things that would probably never be possible in the usual scheme of random evolution and natural selection… and we’re able to do it very quickly due to our ability to share knowledge. But we as humans do forget things sometimes. We don’t really know how to build those massive stone pyramids anymore. We’ve kind of forgotten how to put humans on the moon. (SpaceX still has embarrassing newbie problems with fueling rockets.) We’ve forgotten how to design efficient human-computer-interfaces that don’t waste cognitive energy and muscle movement.
Super Artificial Intelligence is the next step and it should greatly improve upon the efficiency of evolution provided it can keep the shared knowledge going much more accurately than we have, but that’s not going to happen if we don’t get the foundation right.
Anyway, is it time for the smartphone to get intelligent and be able to learn how to make things much easier for you? Or are we going to be stuck with poking icons haphazardly arranged on a screen like we’ve been doing since the GUI was invented?