Hour by Hour: Reverse Engineering Your Schedule
Hour by Hour is a clever new approach to scheduling your time from Joe Humfrey of Selkie Design that took me a little while to get used to, but has really grown on me.
The app was inspired by travel planning and the age-old question, “When should I leave for the airport?” You’ve probably been there before. You have a flight at, say, 2:00 pm, but you need to drive 30 minutes to the airport, add some time to park, take a shuttle to the terminal, get through security, and build in a little extra wiggle room just in case traffic is bad or something else goes sideways. Suddenly, 2:00 pm becomes an exercise in mental gymnastics as you work your way back to when you should walk out the door.
Hour by Hour solves this sort of scheduling, but for every type of event, by using the same kind of reverse planning. At the same time, it’s not really a calendar app so much as a scheduling companion for your calendar. You can pull your calendar events into Hour by Hour, but you don’t have to, and if you dive into the app expecting to use it the same way you use a traditional calendar, the assumptions you bring with you will probably trip you up.
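To make the reverse-planning idea concrete, here is a minimal Python sketch of the airport example. It is not how Hour by Hour is implemented; the function name `leave_by`, the calendar date, and every duration other than the 30-minute drive are assumptions made up for illustration.

```python
from datetime import datetime, timedelta

def leave_by(event_time, steps):
    """Work backward from event_time, subtracting each step's duration.

    steps is a list of (name, minutes) pairs in the order they happen.
    Returns the time to walk out the door and the start time of each step.
    """
    schedule = []
    cursor = event_time
    # Walk the steps from last to first, moving the clock earlier each time.
    for name, minutes in reversed(steps):
        cursor -= timedelta(minutes=minutes)
        schedule.append((cursor, name))
    schedule.reverse()  # put the steps back in chronological order
    return cursor, schedule

# A 2:00 pm flight; the date is arbitrary.
flight = datetime(2026, 1, 1, 14, 0)
steps = [
    ("Drive to the airport", 30),     # from the example above
    ("Park", 10),                     # assumed
    ("Shuttle to the terminal", 15),  # assumed
    ("Security", 45),                 # assumed
    ("Wiggle room", 20),              # assumed
]

departure, schedule = leave_by(flight, steps)
for start, name in schedule:
    print(start.strftime("%I:%M %p"), name)
```

With these assumed durations, the steps add up to two hours, so the time to leave works out to 12:00 pm.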
Claude Mythos Preview Will Only Secure Part of the Internet
Yesterday, Anthropic announced Claude Mythos Preview, a new general-purpose model that it says is exceptionally good at finding security vulnerabilities in code. In fact, the model is so good that Anthropic has decided not to release Mythos Preview to the general public. Instead, it’s being released to a select group of companies that control OSes and other critical software.
Anthropic found thousands of vulnerabilities across every major OS and web browser with Mythos Preview, but used these three examples to illustrate their severity:
- Mythos Preview found a 27-year-old vulnerability in OpenBSD—which has a reputation as one of the most security-hardened operating systems in the world and is used to run firewalls and other critical infrastructure. The vulnerability allowed an attacker to remotely crash any machine running the operating system just by connecting to it;
- It also discovered a 16-year-old vulnerability in FFmpeg—which is used by innumerable pieces of software to encode and decode video—in a line of code that automated testing tools had hit five million times without ever catching the problem;
- The model autonomously found and chained together several vulnerabilities in the Linux kernel—the software that runs most of the world’s servers—to allow an attacker to escalate from ordinary user access to complete control of the machine.
A lengthy Frontier Red Team report brings the receipts for security researchers with an in-depth look at what Mythos Preview uncovered and the step change that the new model represents over Opus 4.6:
For example, Opus 4.6 turned the vulnerabilities it had found in Mozilla’s Firefox 147 JavaScript engine—all patched in Firefox 148—into JavaScript shell exploits only two times out of several hundred attempts. We re-ran this experiment as a benchmark for Mythos Preview, which developed working exploits 181 times, and achieved register control on 29 more.
As part of a test, Mythos Preview also managed to escape its sandboxed environment, message the researcher conducting the test, and then, outside the parameters of the test, post about the exploit online.
The idea behind Project Glasswing, whose participants include Amazon Web Services, Anthropic, Apple, Broadcom, Cisco, CrowdStrike, Google, JPMorganChase, the Linux Foundation, Microsoft, NVIDIA, and Palo Alto Networks, is to give them a head start at securing their systems before similar models emerge and are exploited for cyberattacks. If Mythos Preview’s capabilities are as Anthropic makes them out to be, this seems like the right approach. However, I do worry that with time, it could lead to a two-tier Internet where big tech companies operate in relative security thanks to tools like Mythos Preview, while those without access are left to swim with the sharks.
How Can Everybody Hate Their Weather App When There Are So Many Great Choices?
Twitter clients may have been a design playground in the early days of the App Store, but it’s weather apps that have carried the torch. That’s because the developers of weather apps have to simultaneously contend with a vast amount of data and a wide variety of user preferences.
Last week, for The New Yorker, Kyle Chayka profiled Acme Weather, the new weather app from the team behind Dark Sky that I recently reviewed.
The problem with weather apps, as Brian Mueller, who was interviewed for the story, puts it, is this:
“Everybody wants their own weather app,” Mueller told me. An Angeleno may care more about air quality, for instance, whereas a Bostonian wants to know the chance of snow. Carrot’s imperfect solution is to allow users to customize their own display, choosing which information to foreground, against a backdrop of chaotic animations and snarky jokes (“The temperature is low, but my disdain for you is even lower”). Hello Weather separates various stats—on UV or wind—into separate onscreen tiles. Acme’s answer, the most elegant of the three, is to show a minimum of information based on what matters most in a given moment.
Chayka clearly prefers Acme’s approach, which overlays weather predictions from multiple forecasters accompanied by a short narrative summary. I like Acme Weather’s approach, too, but Chayka was too quick to dismiss CARROT Weather and Hello Weather’s approaches. The fact that all three, plus other top-tier weather apps like Mercury Weather, can co-exist proves Mueller’s point that everyone wants their own weather app. I’d argue the real problem is that most users haven’t found the right weather app for them or aren’t willing to pay for a better one. Acme Weather is an excellent app, but it’s just one among many great choices, and as users, we’re fortunate that there’s room for all of them.
Roadtripping with ChatGPT Voice Mode
On Saturday, my wife Jennifer and I drove to Blowing Rock, a quaint little town in the Blue Ridge Mountains. We’d been there once before, but didn’t know the town well, so as we headed west I poked at the ChatGPT icon on my dashboard to give the app’s new CarPlay integration a try. I asked:
What activities would you recommend for a day trip to Blowing Rock, North Carolina?
What I got back was a short but good list of highlights including a hike, a visit to the Blowing Rock cliffside overlook, a few restaurants, a coffee shop, and some local shops. It was similar to a list of activities I’d looked up before we left using Claude. So far, so good.
I switched back to Apple Maps and was thinking I probably wouldn’t use ChatGPT in my car very often, but that it could come in handy for similar requests, when things got a little creepy. I explained to Jennifer that ChatGPT’s CarPlay feature was new, and I had been meaning to check it out all week. Then, just as I’d said I thought it had done a pretty good job, a voice interrupted. It was ChatGPT’s voice mode saying it was glad I liked it.
You see, just like a phone call doesn’t drop when you switch apps in CarPlay, neither does ChatGPT. I suppose I should have anticipated that the mic would remain live, but I didn’t. Nor did I notice the End button in the corner of the screen; I was driving, not studying the app’s UI.
I take it as a positive sign that I didn’t expect ChatGPT to follow me back to Apple Maps. I treat a chatbot like I do any app. Give it some input, and you get an output. Close the app, and you’re done. It’s not my little robot buddy. It’s a tool like any other app.
Of course, that’s not how the voice modes of these chatbots are designed to work. Chats are meant to be an engaging back and forth. But having ChatGPT jump in on our one-on-one conversation while driving down the highway was too much. Suddenly, it felt like something else was in the car eavesdropping on us.
The experience was a good lesson in the balancing of utility and social norms around AI tools. Useful as they can be in some situations, their developers need to be more mindful of user expectations and provide better cues about how they work to avoid uncomfortable surprises. The recommendations we got from ChatGPT were good, but I also don’t expect it will get a second chance on our family road trips anytime soon.
