Azthena, a new AI-powered search and information retrieval system built by AZoBuild.com, launched this week in limited beta. It aims to streamline access to complex building and construction data by leveraging OpenAI’s LLMs. However, the launch is shadowed by transparency concerns regarding data handling – specifically, sharing user queries with OpenAI and a 30-day data retention policy – and by the inherent risks of relying on a closed-source AI for critical industry information.
The Architecture Beneath the Surface: LLM Parameter Scaling and Data Provenance
AZoBuild’s decision to integrate an LLM isn’t surprising. The construction industry is drowning in unstructured data – blueprints, material specifications, regulatory documents, and project reports. Traditional search methods struggle with this complexity. Azthena, at its core, is a retrieval-augmented generation (RAG) system. This means it doesn’t simply *know* the answers; it searches a curated knowledge base (presumably AZoBuild’s extensive library of building information) and then uses an LLM to synthesize a response from what it finds. The critical question, and one AZoBuild hasn’t fully addressed, is *which* LLM and at what scale. Given the partnership with OpenAI, GPT-3.5 or GPT-4 are the likely candidates. However, the performance of these models depends heavily on model scale, meaning the number of trainable parameters in the neural network. A smaller model will be faster and cheaper to run, but will likely produce less accurate and less nuanced results. A larger model, like GPT-4 with its rumored but unconfirmed 1.76 trillion parameters, offers superior performance but demands significantly more computational resources.
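To make the retrieve-then-synthesize pattern concrete, here is a minimal sketch of the retrieval half of a RAG pipeline. It is not Azthena's implementation, which AZoBuild has not published. The three-entry `DOCUMENTS` knowledge base, the bag-of-words cosine scoring, and the function names are all assumptions chosen for illustration; a production system would more likely use embedding-based search and would pass the assembled prompt to an LLM.

```python
import math
import re
from collections import Counter

# Hypothetical mini knowledge base standing in for a curated library.
DOCUMENTS = {
    "concrete-curing": "Concrete curing keeps moisture in fresh concrete so it reaches design strength.",
    "steel-fireproofing": "Intumescent coatings protect structural steel by expanding when exposed to fire.",
    "timber-moisture": "Timber framing should be kept below a target moisture content to limit shrinkage.",
}


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two term-count vectors (Counter objects)."""
    dot = sum(a[t] * b[t] for t in set(a) & set(b))
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def retrieve(query, documents, k=2):
    """Return up to k document ids, best match first, skipping zero-score documents."""
    q = Counter(tokenize(query))
    scored = [(cosine(q, Counter(tokenize(text))), doc_id) for doc_id, text in documents.items()]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [doc_id for score, doc_id in scored[:k] if score > 0]


def build_prompt(query, documents, k=2):
    """Assemble the text that would be sent to the LLM: retrieved sources plus the question."""
    ids = retrieve(query, documents, k)
    context = "\n".join(f"[{doc_id}] {documents[doc_id]}" for doc_id in ids)
    return (
        "Answer using only the sources below and cite them by id.\n\n"
        f"Sources:\n{context}\n\nQuestion: {query}"
    )


if __name__ == "__main__":
    print(build_prompt("How do intumescent coatings protect steel?", DOCUMENTS))
```

The point of the sketch is that the model only sees what retrieval hands it, so the quality and provenance of the knowledge base bound the quality of the answer regardless of how large the LLM is.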
The real vulnerability lies in the data provenance. Where is AZoBuild sourcing its knowledge base? Is it relying solely on publicly available information, or is it incorporating proprietary data from manufacturers and suppliers? If the latter, the accuracy and impartiality of Azthena’s responses become highly suspect. The terms of service explicitly state that users should “confirm any data provided with the related suppliers or authors,” a tacit admission of potential inaccuracies. This isn’t a bug; it’s a feature of relying on a black-box AI trained on potentially biased or outdated data.
What This Means for Enterprise IT
For large architecture, engineering, and construction (AEC) firms, Azthena presents a potential productivity boost, but also a significant risk. The convenience of a single search interface must be weighed against the potential for errors and the lack of control over the underlying data. Whether it will integrate with existing Building Information Modeling (BIM) software – like Autodesk Revit or Graphisoft Archicad – is currently unclear. Without seamless integration, Azthena risks becoming another siloed information source, defeating the purpose of a unified digital workflow.

The Privacy Trade-Off: OpenAI Data Sharing and the Implications for Confidentiality
The most concerning aspect of Azthena’s launch is the explicit sharing of user queries with OpenAI. While AZoBuild states that email details are not shared, the transmission of search terms – which could include sensitive project details, proprietary designs, or confidential client information – raises serious privacy concerns. OpenAI retains this data for 30 days, ostensibly for model improvement. However, this retention period creates a potential data breach risk and could violate confidentiality agreements. The terms also explicitly prohibit asking questions containing “sensitive or confidential information,” but this relies entirely on user compliance and doesn’t address the inherent risk of accidental disclosure.
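One way a firm could reduce accidental disclosure is to screen queries on its own side before they leave the network. The sketch below is an illustration only and is not a feature of Azthena. The `PRJ-` project-code pattern, the `redact_query` function, and the list of confidential terms are all hypothetical, and pattern matching of this kind will miss anything it has not been told to look for.

```python
import re

# Simple email pattern; deliberately loose, intended for screening rather than validation.
EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
# Hypothetical internal project-code format such as "PRJ-2041".
PROJECT_CODE = re.compile(r"\bPRJ-\d{3,}\b")


def redact_query(query, confidential_terms=()):
    """Replace sensitive substrings with placeholders.

    Returns the redacted text and the list of substrings that were removed,
    so the caller can warn the user or block the query entirely.
    """
    findings = []

    def substitute(pattern, label, text):
        def replace(match):
            findings.append(match.group(0))
            return f"[{label}]"
        return pattern.sub(replace, text)

    text = substitute(EMAIL, "EMAIL", query)
    text = substitute(PROJECT_CODE, "PROJECT", text)
    for term in confidential_terms:
        if term:  # skip empty strings, which would match everywhere
            text = substitute(re.compile(re.escape(term), re.IGNORECASE), "CONFIDENTIAL", text)
    return text, findings


if __name__ == "__main__":
    redacted, found = redact_query(
        "Fire rating for PRJ-2041 at Harbor Tower, contact jane@example.com",
        confidential_terms=["Harbor Tower"],
    )
    print(redacted)
    print(found)
```

A filter like this complements the terms of service rather than replacing them: it catches known patterns, while free-text descriptions of a confidential design would still pass through.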
This practice highlights a growing trend in the AI industry: the commodification of user data. Companies are increasingly willing to trade privacy for convenience and performance. However, this trade-off is often made without adequate transparency or user consent. The EU’s General Data Protection Regulation (GDPR) and similar privacy laws around the world are attempting to address these concerns, but enforcement remains a challenge. The official GDPR website provides detailed information on data protection rights.
“The biggest risk isn’t necessarily the AI getting the answers wrong, it’s the data leakage. AEC firms are incredibly protective of their intellectual property. Sending project details to a third-party, even for ‘model improvement,’ is a non-starter for many.” – Dr. Anya Sharma, CTO of BuildSecure, a cybersecurity firm specializing in the AEC industry.
The Ecosystem Play: AZoBuild vs. Open-Source Alternatives
AZoBuild’s move positions it squarely against the growing open-source AI community. Projects like Meta’s Llama 2 and the Technology Innovation Institute’s Falcon offer viable alternatives to OpenAI’s closed-source models. These openly available LLMs allow organizations to host and customize the AI themselves, eliminating the privacy concerns associated with third-party data sharing. However, they require significant technical expertise and computational resources. The trade-off is control versus convenience.
The long-term success of Azthena will depend on AZoBuild’s ability to address these concerns. Offering users the option to opt out of data sharing, providing greater transparency into the data sources used to train the AI, and integrating with open-source LLMs could significantly improve its appeal. Currently, it feels like a walled garden, leveraging the power of AI but at the cost of user privacy and control.
The 30-Second Verdict
Azthena is a promising concept hampered by questionable data practices. While the potential for streamlining building information access is significant, the privacy risks and lack of transparency are deal-breakers for many organizations. Wait for AZoBuild to address these concerns before considering adoption.

API Capabilities and Future Development
Currently, details regarding Azthena’s API are scarce. The ability to programmatically access the AI’s functionality would be crucial for integration with existing AEC workflows. Key API features would include query submission, response parsing, data filtering, and access control. Without a robust API, Azthena will remain a standalone tool, limiting its impact. The pricing model for the API is also unknown. OpenAI charges based on token usage – tokens are sub-word chunks of text, so a token count typically runs somewhat higher than a word count – and AZoBuild will likely adopt a similar approach. However, the cost of using Azthena could quickly escalate for organizations with high query volumes.
The lack of information regarding the AI’s ability to handle different data formats is also concerning. Can it process PDFs, CAD files, and BIM models directly, or does it require data to be converted into a text-based format? The latter would introduce another layer of complexity and potential errors. The W3C’s Semantic Web standards offer a potential solution for standardizing data formats and improving interoperability between different systems.
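To show how token-based billing scales with query volume, the following back-of-the-envelope sketch estimates a monthly bill. Every number in it is an assumption: Azthena's pricing has not been announced, the per-1,000-token prices are placeholders rather than real rates, and the four-characters-per-token figure is only a rough rule of thumb for English text.

```python
import math


def estimate_tokens(text):
    """Rough token estimate using the ~4 characters per token rule of thumb for English."""
    return max(1, math.ceil(len(text) / 4))


def monthly_cost(queries_per_day, prompt_tokens, completion_tokens,
                 price_in_per_1k, price_out_per_1k, days=30):
    """Estimated monthly spend when input and output tokens are billed per 1,000 tokens."""
    per_query = (prompt_tokens / 1000) * price_in_per_1k \
        + (completion_tokens / 1000) * price_out_per_1k
    return queries_per_day * days * per_query


if __name__ == "__main__":
    # Placeholder scenario: 1,000 queries per day, 1,500 prompt tokens per query
    # (the question plus retrieved passages) and 500 tokens of generated answer.
    cost = monthly_cost(1000, 1500, 500, price_in_per_1k=0.01, price_out_per_1k=0.03)
    print(f"Estimated monthly cost: ${cost:,.2f}")
```

In a retrieval-augmented system the retrieved passages are sent along with the question, so prompt tokens usually dominate the bill; that is why the placeholder scenario assumes a prompt three times longer than the answer.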
“The AEC industry is notoriously slow to adopt new technologies. Trust is paramount. AZoBuild needs to demonstrate a commitment to data security and privacy before it can gain widespread acceptance.” – Ben Carter, Lead Developer at BIMTech Solutions.
The launch of Azthena is a microcosm of the broader AI revolution: immense potential coupled with significant risks. The future of building information management will undoubtedly be shaped by AI, but only if these risks are addressed proactively and transparently.